Google GCP BigQuery CSV table encountered too many errors, giving up

Sylvia Sun
3 min read · Apr 2, 2020


I recently did an analysis of the Covid-19 situation around where I live and used GCP BigQuery for the data pipeline. Along the way I hit this error: "CSV table encountered too many errors, giving up."

It took me a long time to debug, so I hope this post helps you understand the root cause of the error.

Before I figured out why, I googled the error message, which led me to two articles with suggested solutions. I tried both, but unfortunately neither worked for me. I'll link to those two solutions at the end of the article for easy reference.

Instead, after some investigation of my own, I realized that every time I loaded a CSV file from GCS or via file upload, BigQuery automatically added a new field, "int64_field_0", to the table. I ran into this because my dataset was generated with pandas' default options, without excluding the index column. The CSV therefore contains a column with no name, which is simply the index you see in pandas. With pandas read_csv, the first unnamed column is treated as the index by default; BigQuery, however, does not exclude it and instead treats it as a new field in the table.
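Roughly, this is how the extra column arises; the column names below are made up just for illustration:

```python
# Minimal sketch of how the unnamed index column ends up in the CSV,
# assuming a DataFrame written with pandas' default options.
import pandas as pd

df = pd.DataFrame({"county": ["King", "Pierce"], "cases": [100, 50]})

# Default: to_csv writes the index as a first, unnamed column.
# BigQuery's schema auto-detect then names it "int64_field_0".
df.to_csv("covid_with_index.csv")

# Passing index=False avoids the extra column altogether.
df.to_csv("covid_no_index.csv", index=False)

# read_csv can hide the problem: the unnamed first column is absorbed
# back into the index, so the DataFrame looks clean even though the CSV is not.
round_trip = pd.read_csv("covid_with_index.csv", index_col=0)
```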

If you don't include this field in your schema, the data transfer will always fail. If you upload or transfer the data through the web UI with "auto detect schema" selected, the field is added to the table that gets created. If you are not aware of it and try to append data to that auto-generated table without explicitly declaring the field, the transfer won't go through. So ultimately, the solution for me was to declare this field in the schema config in the script.
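A minimal sketch of what that looks like with the google-cloud-bigquery Python client, assuming a load from GCS; the bucket path, table ID, and the other schema fields are placeholders:

```python
from google.cloud import bigquery

client = bigquery.Client()

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
    schema=[
        # Declare the auto-generated index column explicitly; otherwise
        # appends to the existing table fail with
        # "CSV table encountered too many errors, giving up".
        bigquery.SchemaField("int64_field_0", "INTEGER"),
        bigquery.SchemaField("county", "STRING"),
        bigquery.SchemaField("cases", "INTEGER"),
    ],
)

load_job = client.load_table_from_uri(
    "gs://my-bucket/covid_with_index.csv",  # placeholder GCS path
    "my-project.my_dataset.covid_cases",    # placeholder table ID
    job_config=job_config,
)
load_job.result()  # wait for the load job to finish
```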

Going forward, Google might update the backend so this becomes less of a headache for users; in the meantime, I hope this short article gives you a direction for debugging.

Below are the two alternative solutions I found; you can try them if the above doesn't work for you:

--

--

Sylvia Sun

I have spent my career analyzing different kinds of data, from financial to text, and I'm fascinated by new technologies that can be used in data analysis.