> The whole issue is caused because Partition Tolerance has not been considered in the design
No, the whole issue was caused by using message queues where you shouldn't, or in a way unsuitable for this problem. If you want to handle the situation where incoming messages pile up and have to be grouped because they are not being processed at all, use a database. This comes with a host of its own problems (you handle this specific case well at the expense of the usual path), but it's not a problem with message queues.
> A better way is for the customer service to poll the Order service periodically. Every 1/2 hour, a REST GET of ‘Give me a list of customers who have made orders since xxxx’. can be issued to the Order service.
Great, but now the Customer Service needs to know what the Order Service is and how to talk to it, and the Order Service needs to worry about persistence. It didn't have to store orders before, only process them; storing was perhaps the responsibility of an Order Archive Service. Apply this pattern to 10 services and you can end up with as many as n*(n-1)/2 = 45 connections between them that you wouldn't have otherwise.
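To make the connection count concrete, here is a rough sketch that counts the worst-case links when every service may poll every other one directly, compared with every service talking only to a shared queue. The service names and the `broker` label are made up for illustration, and the comparison assumes each service needs exactly one connection to the queue.

```python
from itertools import combinations


def point_to_point_links(services):
    """Every unordered pair of services that may need to know about each other."""
    return list(combinations(services, 2))


def broker_links(services, broker="queue"):
    """Each service only knows about the shared queue (hypothetical broker name)."""
    return [(service, broker) for service in services]


# Ten hypothetical services, as in the example above.
services = [f"service-{i}" for i in range(1, 11)]

print(len(point_to_point_links(services)))  # 45, i.e. 10 * 9 / 2
print(len(broker_links(services)))          # 10, one link per service
```

The pairwise count grows quadratically with the number of services, while the queue-based count grows linearly, which is the coupling cost being pointed out here.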