Using Cosmos DB Bindings to save cookies.

Part 3 of my Serverless Instagram Bot using Docker and Azure Functions series. I am learning to use Cosmos DB to save and retrieve browser state.

Photo by NASA on Unsplash

Persisting Sessions🚨

In the previous parts, I set up a headless browser to automate commenting and liking on www.instagram.com for me. So far, I have been able to log in to my account.

The Problem

The Browserless Docker container runs in an Azure VM in UK South. I clicked “This Was Me” in the browser on my PC, where I was already logged in. The page refreshed and landed back on my logged-in profile, so I believed that had authorized that location (UK South / London, United Kingdom) for my account. I ran my Azure HttpTrigger function again and still got the “Suspicious Login Attempt” page blocking me.

I figured it was probably because no state, storage or cookies persisted between HTTP-triggered requests. Solving this meant using a cloud database to save things like cookies and local storage data when a login succeeds, and to retrieve them when a request comes in so they can be loaded back into the browser instance. I had already been looking at Cosmos DB.

Enter Cosmos DB

I had been reading up on the Azure Functions Cosmos DB docs and also the Triggers and Bindings docs. After creating a Cosmos DB by following the official guide, I got stuck grokking the connectionStringSetting. The UI and the docs had diverged significantly; I tried following the directions but got lost. I finally figured it out after days of lazy hacking.
The idea is to use Cosmos DB input and output bindings to retrieve and set state data (local and session storage, cookies) for each request.

Cosmos DB Input and Output Binding: Lookup by ID in query

There’s a bit of configuration that goes into using Cosmos DB bindings. To configure them, I had to edit the function.json file for the HTTP Trigger function. I took the basic binding configuration from the Azure guide:

[
  ...
  {
    "name": "pageCookiesIn",
    "type": "cosmosDB",
    "databaseName": "Sessions",
    "collectionName": "Cookies",
    "createIfNotExists": true,
    "connectionStringSetting": "COSMOS_ONE",
    "direction": "in"
  },
  {
    "name": "pageCookiesOut",
    "type": "cosmosDB",
    "databaseName": "Sessions",
    "collectionName": "Cookies",
    "createIfNotExists": true,
    "connectionStringSetting": "COSMOS_ONE",
    "direction": "out"
  },
  ...
]
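To make the two bindings concrete, here is a minimal sketch of how a handler might read pageCookiesIn and write pageCookiesOut. The helper loginAndGetCookies is hypothetical (it stands in for the Puppeteer login flow), and the document shape { id, username, cookies } is an assumption for illustration rather than my exact code.

```javascript
// Sketch of an HTTP-triggered handler that uses the two Cosmos DB bindings.
// `loginAndGetCookies` is a hypothetical helper standing in for the Puppeteer
// login flow; it is injected so the sketch can run without a browser.
function createHandler(loginAndGetCookies) {
  return async function (context, req, pageCookiesIn) {
    const { username, password } = req.body || {};

    // Input binding: whatever Cosmos DB returned for this request (may be undefined).
    const savedCookies = pageCookiesIn && pageCookiesIn.cookies;

    const cookies = await loginAndGetCookies({ username, password, savedCookies });

    // Output binding: the value assigned to context.bindings.<name> is written
    // as a document. The { id, username, cookies } shape is an assumption.
    context.bindings.pageCookiesOut = { id: username, username, cookies };

    context.res = { status: 200, body: { saved: cookies.length } };
  };
}
```

In a real function the exported handler would be something like module.exports = createHandler(myLoginFlow), where myLoginFlow drives the Browserless instance.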

I had to wrap my code in try / catch blocks because the Puppeteer API calls I was using, page.waitForNavigation and page.waitForSelector, throw an Error and crash the app if things do not work as expected.
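As a rough illustration of that wrapping, the sketch below guards a single page.waitForSelector call. The helper name, the log argument and the default timeout are my own illustrative choices, not part of Puppeteer.

```javascript
// Wait for a selector without letting a timeout crash the function.
// `page` is a Puppeteer Page; `log` is any logging function (e.g. context.log).
// The helper name and the default timeout are illustrative assumptions.
async function waitForSelectorSafely(page, selector, log, timeout = 10000) {
  try {
    await page.waitForSelector(selector, { timeout });
    return true;
  } catch (err) {
    // Puppeteer rejects (for example with a TimeoutError) when the element
    // never appears; report it and let the caller decide what to do next.
    log(`waitForSelector("${selector}") failed: ${err.message}`);
    return false;
  }
}
```

The caller can then branch on the boolean, for example returning an error response instead of letting the exception bubble up.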

Learn how to handle errors in your Azure function here.

Setting up Connection Strings and Firewall

In function.json, I had to configure the connectionStringSetting property. First I needed the Cosmos DB keys. Once I had the key / value, I added it to local.settings.json and also to my Function -> Application Settings.

In my Azure Cloud Shell, I had to run this command to get my connection string

az cosmosdb list-connection-strings --name <Cosmos DB account name> --resource-group <resource group name>
local.settings.json
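As a rough sketch, a local.settings.json for this setup might look like the following. The COSMOS_ONE key matches the connectionStringSetting used in function.json; everything in angle brackets is a placeholder, and the other values are typical defaults I am assuming rather than a copy of my real file.

```json
{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "<storage account connection string>",
    "FUNCTIONS_WORKER_RUNTIME": "node",
    "COSMOS_ONE": "AccountEndpoint=https://<Cosmos DB account name>.documents.azure.com:443/;AccountKey=<primary key>;"
  }
}
```

The same COSMOS_ONE name and value then go into the Function's Application Settings so the deployed function can resolve the binding.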

After completing this, my bindings still failed to write a new Cosmos DB document. I had an error which I fixed by disabling the firewall on my Cosmos DB account.

In production, I’d probably not handle it this way.

Handling the state data

So far, with this configuration and the code, I was able to write the cookies out to my Cosmos DB collection when I triggered the function (press F5 in VS Code). If you have been following along and have the code from GitHub, you can execute your function and you should see some cookies from the Instagram session, created on the Browserless Puppeteer instance, saved in your Cosmos DB collection Sessions/Cookies.

Using the IG account username as a doc ID.

Using JSON Payloads in Binding Expressions to query documents

When a trigger payload is JSON, you can refer to its properties in configuration for other bindings in the same function and in function code.

The POST request contains a JSON payload { username: XXXX, password: XXXX}. Here is an example of the HTTP request

POST http://localhost:7071/api/HttpTrigger HTTP/1.1
content-type: application/json

{"username": "mXXXng","password": "XXXX"}

http://localhost:7071/api/HttpTrigger is my local development endpoint when I run my Azure function in VS Code. I could also use Postman or curl to send the HTTP request. I use the above request with a VS Code extension called REST Client.

According to the Azure docs on Binding Expression Patterns, I can refer to username in my function.json file as {username} . Time to try it out.

Partition key value must be supplied for this operation. OUCH 🐼

I spent hours trying to figure out that error. I looked it up on Stack Overflow. I re-ran the code and it still went all the way through logging in and saving the cookies to Cosmos DB, yet the error kept coming up. At one point I had to delete the database from the Azure Cloud Shell. When I ran the function, I did not get the document: pageCookiesIn and context.bindings.pageCookiesIn were both undefined.

module.exports = async function(context, req, pageCookiesIn) {
  context.log(pageCookiesIn); // prints undefined
  ....

pageCookiesIn should contain the document retrieved from the Cosmos DB database.

🔨How I fixed the “Partition key value must be supplied for this operation” error.

I removed the id property from my binding configuration in function.json, and the binding returned an Array. Because the document is stored under a partitionKey that is set dynamically with the binding expression {username}, I can retrieve all the documents for that {username} in that partition / collection. It makes me think about putting all the state data in one username partition. Someone needs to tell me if I am doing this right 😅. The docs do say the partitionKey stuff is for scalability.
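Since the binding can now hand back an Array, a single document, or nothing at all, a small helper can normalise that before the rest of the code uses it. This is a sketch under the assumption that each stored document carries a username field; the helper name is hypothetical.

```javascript
// Normalise whatever the input binding produced into one document (or null).
// Assumes each stored document has a `username` field; that shape and the
// helper name are illustrative, not taken from the binding itself.
function pickSessionDoc(pageCookiesIn, username) {
  if (!pageCookiesIn) return null;
  const docs = Array.isArray(pageCookiesIn) ? pageCookiesIn : [pageCookiesIn];
  const matches = docs.filter((doc) => doc && doc.username === username);
  // If several documents match, prefer the most recently written one.
  // `_ts` is the last-modified timestamp Cosmos DB adds to each document.
  matches.sort((a, b) => (b._ts || 0) - (a._ts || 0));
  return matches[0] || null;
}
```

Inside the function this could be called as pickSessionDoc(pageCookiesIn, req.body.username) before touching the browser page.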

💡Thinking Strategies, Automation Begins

Here’s my code now.

Lines 27 to 31 above check whether a document pageCookiesIn was found. The Cosmos DB binding in the function.json file should populate it and pass it as the third parameter, pageCookiesIn, after context, req, ... in the function. On line 30, we set the found cookies into the page.
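The check-then-restore step can be sketched roughly like this. It assumes the stored document keeps the cookies in a cookies array, in the shape Puppeteer's page.cookies() returns; the helper name is mine and the field name is an assumption.

```javascript
// Load previously saved cookies into a Puppeteer page, if a document was found.
// Assumes the document stores them under `cookies` (an illustrative field name).
async function restoreCookies(page, pageCookiesIn) {
  const cookies =
    pageCookiesIn && Array.isArray(pageCookiesIn.cookies)
      ? pageCookiesIn.cookies
      : [];
  if (cookies.length === 0) return false; // nothing saved yet, so log in normally

  // page.setCookie takes the cookie objects as separate arguments.
  await page.setCookie(...cookies);
  return true;
}
```

Calling this before navigating to Instagram means the saved session cookies are sent with the first request.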

💭So now I am thinking of implementing my automated actions as strategies. 💭Functions that are executed depending on the page content.
💭Maybe they’ll pass the page to another strategy and
💭maybe a strategy can repeatedly call itself.
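To make the idea a little more tangible, here is one way such strategies might be shaped: each one says whether it applies to the current page and then acts on it. Everything here (the strategy names, the selectors, the "continue" convention and the step cap) is made up for illustration; it is a thought experiment, not the finished design.

```javascript
// Each strategy decides whether it applies to the current page and, if so, acts.
// Names and selectors are invented for illustration only.
const strategies = [
  {
    name: "suspiciousLogin",
    matches: async (page) => (await page.$("#suspicious-login")) !== null,
    run: async () => "needs-verification",
  },
  {
    name: "feed",
    matches: async (page) => (await page.$("article")) !== null,
    run: async () => "ready",
  },
];

// Run the first matching strategy; a strategy returning "continue" lets the
// loop pick again (possibly the same strategy), capped by maxSteps.
async function runStrategies(page, list = strategies, maxSteps = 5) {
  const trail = [];
  for (let step = 0; step < maxSteps; step++) {
    let matched = null;
    for (const strategy of list) {
      if (await strategy.matches(page)) {
        matched = strategy;
        break;
      }
    }
    if (!matched) break;
    const outcome = await matched.run(page);
    trail.push({ strategy: matched.name, outcome });
    if (outcome !== "continue") break;
  }
  return trail;
}
```

The step cap is there so a strategy that keeps asking to continue cannot loop forever inside a single function invocation.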

🌖To be concluded…

So far, I’ve been able to log in to my own Instagram profile, though I still have a few steps to go. I will complete this app in another post. I plan to:

  1. Create more strategies that will perform actual automated actions.
  2. Save and Fetch LocalStorage data to Cosmos DB.
  3. Identify elements on the page like images and text and buttons.
  4. Carry out actions such as liking posts, commenting on posts, or replying.
  5. Make a video explaining the code in this post. 📹

👏Please clap. Please share this post. 👏

I appreciate claps if you ❤️ this post. Stay with me for Another DIY JavaScript Experiment and dope coding topics including Machine Learning, the React ecosystem, Linux and anything JavaScript does.
