Pyspark: How to Modify a Nested Struct Field

In our adventures building a data lake, we use dynamically generated Spark clusters to ingest data from MongoDB, our production database, into BigQuery. To do that, we use PySpark DataFrames, and since MongoDB doesn't enforce schemas, we try to infer the schema from the data itself.
collection_schema = spark.read.format("mongo") \
    .option("database", db) \…