How Zendesk Serves TensorFlow Models in Production
Wai Chee Yau

Hi Wai,
Great article!

I was trying out TensorFlow Serving to host a pre-trained Inception model on Google Cloud (a 1-GPU K80, 4-CPU machine with 15 GB RAM), and saw that Serving was using the CUDA capabilities.
I got a response locally for a single image inference in nearly 12 seconds! Is that expected, or am I doing something wrong? What machine configuration do you use for your Amazon EC2 instances, and what approximate response time do you get for a single request?
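For reference, here is a minimal sketch of how one might time a single request against a TensorFlow Serving REST endpoint. The host, port, model name (`inception`), and input shape are assumptions, not details from the article; adjust them to match the actual deployment.

```python
# Sketch: timing one inference request against a TensorFlow Serving
# REST endpoint. Endpoint URL and model name are assumptions.
import json
import time
import urllib.request


def timed_call(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def predict(server_url, instances):
    """POST a TF Serving REST predict request and return the parsed JSON."""
    payload = json.dumps({"instances": instances}).encode("utf-8")
    req = urllib.request.Request(
        server_url,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


if __name__ == "__main__":
    # Hypothetical endpoint; TF Serving's REST API listens on 8501 by default.
    url = "http://localhost:8501/v1/models/inception:predict"
    # A real request would carry a preprocessed image tensor here.
    result, elapsed = timed_call(predict, url, [[0.0] * (299 * 299 * 3)])
    print(f"single-request latency: {elapsed:.3f}s")
```

One thing worth checking before comparing numbers: the first request after the server starts often includes one-time GPU/CUDA initialization and memory allocation, so timing a warmed-up request (i.e. the second or later call) usually gives a much lower, more representative latency.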

Thanks in advance!
