Network Traffic Analysis of Google Bard

Rakesh seal
Keysight ATI
6 min read · Apr 19, 2023

Google Bard is the latest entrant in the growing field of AI chatbots on the internet. It is designed to imitate conversations with humans, using a mix of machine learning and natural language processing to provide practical and genuine responses to user inquiries. Initially launched as a web application within a particular geographic region, Bard has since gained immense popularity.

This blog will explain what is happening in the background when we are eagerly waiting for the answers to our questions and analyze the observed network traffic.

Bard is accessible as a web application. Multiple Google domains appear in the overall network capture. Let’s break the network activity into four parts and analyze where each host is used and what its characteristics are:

Login Management

The main Bard page is secured behind “accounts.google.com”. The host uses QUIC Version-1. Multiple request-response exchanges are seen on this host, which is consistent with Google’s multi-factor authentication.

The same host is seen at the end of the session when the user logs out.

Web Content

Once the user logs into the main site, the web content, including CSS, JS, and static content, begins to load. We have observed two primary hosts serving these contents:

1. fonts.googleapis.com: This host uses the QUIC Version-1 protocol, with a single QUIC stream observed. A DNS request was also seen for this host.

2. gstatic.com: We observed two hosts from the gstatic family, fonts.gstatic.com and www.gstatic.com. Both hosts use the QUIC Version-1 protocol as transport, with multiple QUIC streams observed. DNS requests preceding the connections were also seen for both hosts.

Bard Chat Services

Chat is the main functionality of this service. The chat service is seen on a single host, “bard.google.com”. However, two separate kinds of request are observed for this functionality:

batchexecute

Fig 1: Initial batchexecute request query

When the chat is opened, two batchexecute requests are observed. These are POST requests, and the URL contains the path /_/BardChatUi/data/batchexecute.

The path suggests that the UI name of the web app is BardChatUi and that a batch-style RPC request is used.

The query string has some interesting information:

1. rpcid: Identifies which function will be called on the server.

2. bl: The name of the backend service handling the request. In the traffic we analyzed, this field exposes the backend web-server name and version.

3. _reqid: A number attached to each request. The IDs on successive requests are closely related and change in the following fashion:

42621 -> 142621 -> 242621 -> 342621 (and so on). Interestingly, this _reqid pattern holds even when the type of request changes (for example, to streamgenerate) on the same host.

4. rt: This value is used to specify the response formatting.
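To make the query parameters concrete, here is a minimal Python sketch that parses such a URL and computes the next _reqid. The URL is hypothetical: only the path and the rpcid value otAQ7b come from the capture, the parameter name rpcid follows the label used in this article, and the bl and rt values are placeholders. The step of 100000 is an assumption based on the pattern shown above, where the leading digit increments and the trailing five digits stay fixed.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL: bl and rt values are placeholders, not captured data.
EXAMPLE_URL = (
    "https://bard.google.com/_/BardChatUi/data/batchexecute"
    "?rpcid=otAQ7b&bl=placeholder-backend-name&_reqid=142621&rt=placeholder"
)

def parse_query(url):
    """Return the query parameters as a flat dict (first value per key)."""
    query = urlsplit(url).query
    return {key: values[0] for key, values in parse_qs(query).items()}

def next_reqid(reqid, step=100000):
    """Next ID in the observed pattern; the step size is an assumption."""
    return reqid + step

params = parse_query(EXAMPLE_URL)
for name in ("rpcid", "bl", "_reqid", "rt"):
    print(f"{name:7} = {params[name]}")
print("next _reqid:", next_reqid(int(params["_reqid"])))
```

Keeping the parsing separate from the _reqid arithmetic makes it easy to drop the second helper if the pattern turns out to differ in other captures.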

Let’s look at the payload of this request:

Fig 2: batchexecute request payload

The request body is a form with type application/x-www-form-urlencoded;charset=utf-8.

When decoded, it has two key-value pairs.

1. f.req: This uses an envelope approach to encapsulate multiple RPC requests. In this example there is only one RPC request in the innermost array. The first element is the rpcid we saw in the request query parameter (‘otAQ7b’ in Fig 2), the second element is the actual payload to be executed, and the last element is the order in which the payload will be processed.

2. at: This is probably an XSRF mitigation parameter. It was observed as a static value in all the bard.google.com requests.
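The envelope described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the exact wire format: the three-element inner array is a simplification (the real array may carry additional elements between the payload and the order), and the payload, order value, and token used in the usage lines are placeholders.

```python
import json
from urllib.parse import urlencode, parse_qs

def build_body(rpcid, payload, order, at_token):
    """Encode one RPC call as an application/x-www-form-urlencoded body."""
    # Simplified envelope: outer array -> list of calls -> [rpcid, payload, order]
    envelope = [[[rpcid, payload, order]]]
    return urlencode({"f.req": json.dumps(envelope), "at": at_token})

def decode_body(body):
    """Return the list of RPC calls and the 'at' token from a form body."""
    form = parse_qs(body)
    envelope = json.loads(form["f.req"][0])
    calls = [
        {"rpcid": call[0], "payload": call[1], "order": call[-1]}
        for call in envelope[0]
    ]
    return calls, form["at"][0]

# Placeholder payload, order, and token values (not captured data).
body = build_body("otAQ7b", "[]", "generic", "PLACEHOLDER_TOKEN")
calls, at_token = decode_body(body)
print(calls)
print(at_token)
```

The decoder reads the order from the last position rather than a fixed index, so it keeps working if the inner array has more elements than this sketch assumes.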

The response repeats the rpcid and carries the result of executing the request payload. The length of the payload precedes the payload itself.

Fig 3: batchexecute response payload.
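A length-prefixed body of this kind can be walked with a small parser. The sketch below is hedged in two ways. First, it assumes that an anti-hijacking prefix such as )]}' may precede the data, which is not stated in the capture description. Second, because the article does not establish whether the declared length counts bytes or characters, the parser reports the declared length but relies on the JSON decoder to find where each chunk ends. The sample response is made up for illustration.

```python
import json

def parse_chunks(text):
    """Split a response into (declared_length, parsed_json) pairs."""
    decoder = json.JSONDecoder()
    # Assumption: a prefix such as )]}' may precede the chunks.
    if text.startswith(")]}'"):
        text = text[4:]
    chunks = []
    pos = 0
    while True:
        # Skip whitespace before the length line.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        end = pos
        while end < len(text) and text[end].isdigit():
            end += 1
        if end == pos:
            raise ValueError(f"expected a length at offset {pos}")
        declared = int(text[pos:end])
        # Skip whitespace between the length and the JSON array.
        while end < len(text) and text[end].isspace():
            end += 1
        value, pos = decoder.raw_decode(text, end)
        chunks.append((declared, value))
    return chunks

# Made-up sample: one chunk echoing the rpcid with a placeholder result.
payload = '[["otAQ7b","placeholder result"]]'
sample = ")]}'\n\n" + str(len(payload)) + "\n" + payload + "\n"
chunks = parse_chunks(sample)
print(chunks)
```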

streamgenerate

The chat traffic comes after two initial batchexecute request-responses.

Fig 4: streamgenerate request query

The request query resembles the batchexecute example above. The differences are:
1. The rpcid is not observed in the request.
2. The URL path is different and seen as /_/BardChatUi/data/assistant.lamda.BardFrontendService/StreamGenerate
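Since the two request kinds share a host and differ in their URL path, they can be told apart when post-processing decrypted traffic. A minimal sketch, assuming the request URLs are already available as strings (for example, exported from a proxy or capture tool); the function name and the example URLs are hypothetical:

```python
from urllib.parse import urlsplit

CHAT_HOST = "bard.google.com"
BATCHEXECUTE_PATH = "/_/BardChatUi/data/batchexecute"
STREAMGENERATE_PATH = (
    "/_/BardChatUi/data/assistant.lamda.BardFrontendService/StreamGenerate"
)

def classify_request(url):
    """Label a request URL as 'batchexecute', 'streamgenerate', or 'other'."""
    parts = urlsplit(url)
    if parts.hostname != CHAT_HOST:
        return "other"
    if parts.path == BATCHEXECUTE_PATH:
        return "batchexecute"
    if parts.path == STREAMGENERATE_PATH:
        return "streamgenerate"
    return "other"

# Hypothetical URLs; the query values are placeholders.
print(classify_request("https://bard.google.com" + BATCHEXECUTE_PATH + "?_reqid=42621"))
print(classify_request("https://bard.google.com" + STREAMGENERATE_PATH + "?_reqid=242621"))
```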

Let’s look at the request body:

Fig 5: streamgenerate request body

The request body is of type application/x-www-form-urlencoded;charset=utf-8, as in the previous example. The form data also follows a similar pattern. The chat question is visible in f.req, and at contains the same static XSRF mitigation token seen earlier.

The processed answer to the request is seen in the response body.

Fig 6: streamgenerate response body

The structure of the body is also similar to the earlier batchexecute response. The actual response array is preceded by the length of the response. We can observe three separate responses for each question, matching the draft answers shown in the UI. Each response is accompanied by IDs, suggesting that the responses are uniquely identifiable.

Analytics and logging

As with any other Google service, analytics and logging traffic was also observed for Bard. Let’s look at the hosts we observed:

1. play.google.com: We observed this host in two separate streams, using TLS 1.3 and QUIC Version-1 respectively. A DNS request preceding the connections was observed for this host.

Fig 7: Data logging

During our observation, we noticed periodic logging to this host. Log requests were seen after each chat request. The POST requests contained an array of values, likely related to front-end performance logs, and the response also returned an array of values.

Fig 8: Response of play logs.

2. www.google-analytics.com: This host was observed using QUIC Version-1 as transport. We found multiple requests to it during the chat session. The response status was 204 (No Content).

3. myactivity.google.com: This is a centralized location to view user activity across Google services. The host uses TLS 1.3. A DNS request preceding the connection was observed for this host.

4. www.googletagmanager.com: This is an analytics tool used to deploy and manage marketing and analysis tags on the web application. The host uses TLS 1.3. A DNS request preceding the connection was observed for this host.
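The host-to-role breakdown across the four parts can be captured in a small lookup table, which is handy when labeling flows from a capture. This is a sketch: the role labels are our own shorthand for the sections of this article, the function names are hypothetical, and hosts not seen in our capture are simply reported as unknown.

```python
from collections import Counter

# Hosts observed in the capture, grouped by the four parts described above.
HOST_ROLES = {
    "accounts.google.com": "login",
    "fonts.googleapis.com": "web content",
    "fonts.gstatic.com": "web content",
    "www.gstatic.com": "web content",
    "bard.google.com": "chat",
    "play.google.com": "analytics and logging",
    "www.google-analytics.com": "analytics and logging",
    "myactivity.google.com": "analytics and logging",
    "www.googletagmanager.com": "analytics and logging",
}

def role_of(host):
    """Return the role of a host name, or 'unknown' if it was not observed."""
    return HOST_ROLES.get(host.lower().rstrip("."), "unknown")

def summarize(hosts):
    """Count how many of the given host names fall into each role."""
    return Counter(role_of(host) for host in hosts)

# Example input: host names as they might appear in DNS queries or SNI fields.
seen = ["bard.google.com", "www.gstatic.com", "fonts.gstatic.com", "example.com"]
print(summarize(seen))
```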

Originally published at https://www.keysight.com.
