5 tips for integrating external services
It’s a common task to use external services to extend the functionality of your website. Weather information or traffic data are typical examples, but there are tons of other use cases. In the best case, you can use client-side code, but most of the time you have to take the detour via your own server to protect secret keys and respect request limits. There are several pitfalls, so let’s have a look at how to avoid some of them.
These tips are ordered by ease of implementation (I guess), so earlier tips may become unnecessary once you apply later ones.
1. Cache it twice
Most of the time, you will not need the latest service data on every request. Use the transient cache or meta fields to cache the service responses for a while. But cache them twice! Use a short-lived cache that keeps every response for a specific lifetime, and keep a second, long-term cache for much longer (or forever), so that if the service does not respond as expected, you can still serve the last valid response.
This caching strategy will shorten most page load times and drastically reduce the number of errors your visitors see.
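The dual-cache idea can be sketched with the Transients API. The service URL, cache keys and lifetimes below are made up for this example:

```php
<?php
/**
 * Sketch of the two-tier cache. Key names, the endpoint and the
 * lifetimes are illustrative assumptions, not fixed recommendations.
 */
function myplugin_get_weather_data() {
	$data = get_transient( 'myplugin_weather_short' );
	if ( false !== $data ) {
		return $data; // Fresh enough, no service request needed.
	}

	$response = wp_remote_get( 'https://api.example.com/weather' );
	if ( ! is_wp_error( $response ) && 200 === wp_remote_retrieve_response_code( $response ) ) {
		$data = json_decode( wp_remote_retrieve_body( $response ), true );
		// Short-lived cache: serves most requests.
		set_transient( 'myplugin_weather_short', $data, 15 * MINUTE_IN_SECONDS );
		// Long-term fallback: kept around much longer.
		set_transient( 'myplugin_weather_long', $data, MONTH_IN_SECONDS );
		return $data;
	}

	// Service failed: fall back to the last valid response (may be false if none exists).
	return get_transient( 'myplugin_weather_long' );
}
```

Note that a visitor only ever waits for the external service when the short-lived cache has expired; every failed request silently falls back to the long-term copy.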
2. Repository class
A repository is useful if you want a clean interface to access your data reliably. It wraps all the caching and updating functionality and makes the data easy to use in the rest of your plugin. It will also be very helpful if you have to refactor your code later on, for example when the service implementation changes.
Please note that this is not a repository in the clean-code sense, because it has a lot of side effects, but anyway…
This is just an example. Use the caching location that best fits your data and request arguments. However, stick to the principle of always going through this repository class when you want to access your data.
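A minimal sketch of such a repository could look like this. The class name, keys and endpoint are hypothetical; the point is that `get()` is the only entry point the rest of the plugin ever calls:

```php
<?php
/**
 * Hypothetical repository wrapping the dual-cache strategy.
 * All names here are illustrative.
 */
class Weather_Repository {

	private const SHORT_KEY = 'myplugin_weather_short';
	private const LONG_KEY  = 'myplugin_weather_long';

	/** The single public entry point for the rest of the plugin. */
	public function get(): ?array {
		$data = get_transient( self::SHORT_KEY );
		if ( false !== $data ) {
			return $data;
		}

		$data = $this->fetch_from_service();
		if ( null !== $data ) {
			$this->store( $data );
			return $data;
		}

		// Service failed: serve the long-term fallback if there is one.
		$fallback = get_transient( self::LONG_KEY );
		return false !== $fallback ? $fallback : null;
	}

	private function fetch_from_service(): ?array {
		$response = wp_remote_get( 'https://api.example.com/weather' );
		if ( is_wp_error( $response ) || 200 !== wp_remote_retrieve_response_code( $response ) ) {
			return null;
		}
		return json_decode( wp_remote_retrieve_body( $response ), true );
	}

	private function store( array $data ): void {
		set_transient( self::SHORT_KEY, $data, 15 * MINUTE_IN_SECONDS );
		set_transient( self::LONG_KEY, $data, MONTH_IN_SECONDS );
	}
}
```

If the service implementation changes later, only `fetch_from_service()` needs to be touched; callers of `get()` stay untouched.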
3. Custom tables
Saving your service responses to transients and meta fields might be easy to implement, but it can result in performance issues later on, especially if you collect a lot of data or want to query posts by the data values.
One thing you can do is create custom tables that fit the structure of your service’s response much better. Think about including expiration fields as well, so that you can eliminate the long-term cache and reduce some of the complexity of the caching strategy.
Read more about the advantages of custom tables: https://medium.com/write-better-wordpress-code/do-not-use-post-meta-fec12a7661
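As a sketch, such a table could be created on plugin activation with `dbDelta`. The table and column names are made up for this example; the `expires_at` column is what lets you query for stale rows directly in SQL:

```php
<?php
/**
 * Sketch: custom table with an expiration column, created on activation.
 * Table and column names are illustrative assumptions.
 */
function myplugin_create_service_table() {
	global $wpdb;
	$table_name      = $wpdb->prefix . 'service_data';
	$charset_collate = $wpdb->get_charset_collate();

	$sql = "CREATE TABLE {$table_name} (
		id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
		post_id BIGINT UNSIGNED NOT NULL,
		payload LONGTEXT NOT NULL,
		fetched_at DATETIME NOT NULL,
		expires_at DATETIME NOT NULL,
		PRIMARY KEY  (id),
		KEY post_id (post_id),
		KEY expires_at (expires_at)
	) {$charset_collate};";

	require_once ABSPATH . 'wp-admin/includes/upgrade.php';
	dbDelta( $sql );
}
register_activation_hook( __FILE__, 'myplugin_create_service_table' );
```

The index on `expires_at` makes "give me all rows that need a refresh" a cheap query, which is exactly what the cron job in the next tip needs.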
4. Cron job
Even if you use a proper caching strategy, there will always come a moment when some user request has to wait for the service to respond. Not only does this increase page load times, but impatient visitors can also cancel the request, which can lead to data integrity issues; at the very least, the next visitor will have to wait again.
With these risks in mind (and because it’s simply not a visitor’s job to pull data from a service), we should find another way to update the service’s data.
The best approach I can think of is a cron job that runs the updates. Depending on how much data you expect to fetch, there are two different update strategies.
Read more about how to optimize WordPress schedules (cron jobs): https://medium.com/write-better-wordpress-code/optimized-use-of-schedules-c18c3c1383e9
If you expect your updates to contain only a few service requests, it is perfectly fine to run them all on a regular basis. Create a scheduled task and fetch your data at any time interval. Just keep an eye on the service’s request limits, and make sure one cron job finishes before the next one starts.
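Registering such a scheduled task is a few lines with the WP-Cron API. The hook name and the callback are placeholders for your own update routine:

```php
<?php
/**
 * Sketch: hourly refresh of the cached service data.
 * 'myplugin_refresh_service_data' and the callback name are
 * hypothetical; plug in your own update function.
 */
add_action( 'myplugin_refresh_service_data', 'myplugin_fetch_and_cache_service_data' );

// Register the event once, e.g. on plugin activation.
if ( ! wp_next_scheduled( 'myplugin_refresh_service_data' ) ) {
	wp_schedule_event( time(), 'hourly', 'myplugin_refresh_service_data' );
}
```

Keep in mind that WP-Cron is triggered by page visits; for low-traffic sites, a real system cron hitting `wp-cron.php` is the more reliable option.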
If you expect a lot of service requests, think about how to batch them reasonably. Let’s say we have 100k posts and every post is connected to some external service data. Regularly fetching new data for all 100k posts is not a solution. One approach could be to only fetch new data for posts that are active, meaning they still get traffic.
At this point, how you build the batch queue in a meaningful way really depends on your specific use case. In the suggested example, an inactive post’s data stays stale until the first new request makes it active again. It might also be useful to implement direct service requests for data that expired long ago. But that depends on your priorities: is it more important that the data is up to date, or that some (even old) data is always available and service downtimes don’t affect visitors?
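One possible shape for such a batch run, assuming the active flag and expiration timestamp live in post meta (the meta keys, batch size and update function are made up for this sketch):

```php
<?php
/**
 * Sketch: one cron run refreshes a limited batch of active posts
 * whose cached data has expired. All names here are hypothetical.
 */
function myplugin_process_update_batch() {
	$query = new WP_Query( array(
		'post_type'      => 'post',
		'posts_per_page' => 50, // Batch size: keep the service request limits in mind.
		'fields'         => 'ids',
		'meta_query'     => array(
			array(
				'key'   => '_myplugin_active', // Set whenever the post gets traffic.
				'value' => '1',
			),
			array(
				'key'     => '_myplugin_expires_at',
				'value'   => time(),
				'compare' => '<',
				'type'    => 'NUMERIC',
			),
		),
	) );

	foreach ( $query->posts as $post_id ) {
		// Placeholder for the actual per-post service request.
		myplugin_update_post_service_data( $post_id );
	}
}
```

Ordering the query by the expiration timestamp would additionally guarantee that the longest-stale posts are refreshed first.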
5. WebHooks
Read the documentation of the service API carefully! If there are WebHooks that can push updates to your system, it’s always better to use those.
It’s really easy to create custom REST routes for your website using the register_rest_route function. Implementing the actual update and the content id mapping can be complicated, but most of the time it’s worth it, because this is the most efficient way to update the service data.
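A minimal endpoint the service could push to might look like this. The namespace, route, header name and secret check are illustrative; real services usually sign their payloads, so check their documentation for the exact verification scheme:

```php
<?php
/**
 * Sketch: REST route for incoming WebHook pushes.
 * Namespace, route and the shared-secret check are assumptions.
 */
add_action( 'rest_api_init', function () {
	register_rest_route( 'myplugin/v1', '/service-update', array(
		'methods'             => 'POST',
		'permission_callback' => function ( WP_REST_Request $request ) {
			// Only accept pushes that carry the shared secret.
			return hash_equals( 'my-webhook-secret', (string) $request->get_header( 'X-Webhook-Secret' ) );
		},
		'callback'            => function ( WP_REST_Request $request ) {
			$payload = $request->get_json_params();
			// Map the service's content id to a local post and
			// update the stored data here (placeholder logic).
			return new WP_REST_Response( array( 'received' => true ), 200 );
		},
	) );
} );
```

With pushes in place, the cron job degrades to a safety net for missed WebHook deliveries instead of being the primary update path.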
Try to cache data and store it in appropriate places. Don’t clutter the options or meta tables with too much data. If you process a lot of data, have a look at WebHooks. Otherwise, decide whether freshness or stability is more important for your use case.
Any objections or additions? I would be happy to discuss it with you.