Hi,
I really like the approach and will perhaps try to set up something similar, since I know each piece of technology involved (Node.js, Heroku, Dialogflow, Socket.IO).
The biggest problem to address here is how to make it usable for the end user. Multiple scenarios and explanations are covered in this old document (https://fileserver.tk.informatik.tu-darmstadt.de/Publications/2006/EuroPLoP2006.pdf), and most of its problems and comments are still valid more than 10 years later; this is the major blocker for me regarding VUI. In almost 100% of cases, a standard UI with screen, touch and/or keyboard and mouse is easier and quicker to use, and the document clearly explains why.
One thing that could be useful would be to work out generic design patterns, at least for the actions found on the majority of websites (filling in a form, navigation, etc.). Design patterns in this context would mean intents and entities in Dialogflow, as well as the corresponding events exposed and consumed by the website.
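
To make that idea a bit more concrete, here is a rough sketch of what such a pattern could look like on the website side. Everything in it is an assumption on my part: the intent names (form.fill_field, navigation.goto), the entity keys (field, value, target) and the event shapes are made up for illustration and are not part of Dialogflow or Socket.IO. It only shows the mapping step from a detected intent plus its entities to an event the page could consume.

```typescript
// Hypothetical sketch: intent names, entity keys and event shapes are
// illustrative assumptions, not an existing Dialogflow or Socket.IO API.

// What the backend would forward after an intent has been detected.
type IntentResult = {
  intent: string;
  parameters: Record<string, string>; // entities extracted from the utterance
};

// Events the website would expose and consume.
type UiEvent =
  | { type: "form:fill"; field: string; value: string }
  | { type: "nav:goto"; target: string };

// Map a detected intent and its entities to a UI event.
// Returns null when the intent is unknown or a required entity is missing.
function intentToUiEvent(result: IntentResult): UiEvent | null {
  switch (result.intent) {
    case "form.fill_field": {
      const { field, value } = result.parameters;
      return field && value ? { type: "form:fill", field, value } : null;
    }
    case "navigation.goto": {
      const { target } = result.parameters;
      return target ? { type: "nav:goto", target } : null;
    }
    default:
      return null;
  }
}
```

The returned event could then be pushed to the browser over the existing socket connection, and the page would only need one handler per event type. Returning null for incomplete input leaves room for the bot to ask a follow-up question instead of acting on a half-understood command.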
