Convert files to PDF using Microsoft Graph & Azure Functions
I come across the task of creating PDF documents quite frequently. While there are a ton of open source and proprietary PDF libraries for almost any platform, I was asking myself: isn’t there a simpler way of doing this, without caring about the drawbacks of the free libs or the cost and licensing complexity of the proprietary ones? Do I really want to care about all the settings and paging challenges?
In the age of serverless there has to be a better way! After some research I found out that this is basically a built-in feature of OneDrive and SharePoint Online as part of Office 365. Now I needed an API to actually use this as part of my app, and this is where Microsoft Graph comes to the rescue!
“Microsoft Graph provides a unified programmability model that you can use to build apps for organizations and consumers that interact with the data of millions of users. You can use the Microsoft Graph REST APIs to access data in Azure Active Directory, Office 365 services, Enterprise Mobility and Security services, Windows 10 services, Dynamics 365, and more. Explore our documentation to learn more about how to use Microsoft Graph APIs.”
The beta version of Microsoft Graph supports converting files to PDF through the DriveItem resource. The supported source formats are doc, docx, epub, eml, htm, html, md, msg, odp, ods, odt, pps, ppsx, ppt, pptx, rtf, tif, tiff, xls, xlsm and xlsx. (You can also convert pretty much any image format to jpg, btw.)
More here: https://docs.microsoft.com/en-us/graph/api/driveitem-get-content-format?view=graph-rest-beta&tabs=http
In order to use the DriveItem resource you need either OneDrive or SharePoint Online as part of Office 365.
So let’s build a serverless app that gives us an easy RESTful endpoint to convert files to PDF using Microsoft Graph in the background. I will do this with C# and Azure Functions, but of course this will work with any language and framework as long as it’s able to make HTTP requests.
We will have to perform the following steps to create this solution:
- Create a new Azure Functions app
- Create an OAuth2 authentication service to request an access token to call the Microsoft Graph
- Create a File Service to upload, convert and delete files using the Microsoft Graph
- Set up Dependency Injection
- Create a new function
- Upload the input file to a drive
- Download the file in pdf format
- Delete the input file from the drive
You can find the full code sample here: https://github.com/GrillPhil/ServerlessPDFConversionDemo
So let’s get started! Here is what you need:
- Microsoft Visual Studio 2019 with Azure Functions Tools
Download free as Community Edition here: https://visualstudio.microsoft.com/vs/community/
- Office 365 tenant
Microsoft provides free Office 365 developer tenants when you join the Office 365 developer program here: https://developer.microsoft.com/en-us/office/dev-program
- Azure subscription to deploy the Functions app to
Sign up for a free Azure subscription here: https://azure.microsoft.com/en-us/free/
Step 1: Create a new Azure Functions app
Open Visual Studio and create a new Azure Functions project. Make sure Azure Functions v3 (.NET Core) is selected and pick the Empty template.
Step 2: Create an OAuth2 authentication service to request an access token to call the Microsoft Graph
Set up authentication with the client credentials flow in Azure Active Directory
Before we can talk to Microsoft Graph we need to authenticate and acquire an access token using the OAuth 2.0 client credentials flow. In our case there is no signed-in user; instead we want to authenticate as a service. Therefore we will use the client credentials flow, providing the ClientId and ClientSecret of an app registration in Azure AD.
So let’s start by creating the app registration in Azure AD. Go to https://portal.azure.com and sign in with an admin user of your O365 tenant, go to Azure Active Directory and select App Registrations.
Add a new app registration and pick a name, e.g. PdfConversionService. Account types and Redirect URI are not relevant for this authentication mode, so you can just leave those as they are.
Next we need to save the values of Application (client) Id and Directory (tenant) Id from the Overview page of the app registration for later use in our function’s settings.
Next we need to create a Client Secret as the credential for our service to authenticate with. Select Certificates & Secrets in the left menu and click the New Client Secret button. Pick a description and an expiration.
Also save the Client Secret for later use in our function’s settings.
Finally we need to add permission to read and write in our SharePoint for the app registration. Select API permissions and click the Add a permission button. Choose Microsoft Graph and pick Application permissions. An application permission means that we act in the name of an application instead of on behalf of a user (delegated permission). Search for file and select Files.ReadWrite.All.
After adding the permission we also need to grant this permission to the app registration. The Grant admin consent button might take a moment to become active. When it is active, click it and enter your credentials in the popup window.
Now we have completed the app registration configuration.
Function app settings for authentication
To use the credentials we have just created along with some additional authentication configuration let’s create a model for authentication options:
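A minimal sketch of what such an options model could look like. The class and property names are illustrative assumptions, not necessarily the ones used in the linked sample repository, and the default scope assumes the v2.0 token endpoint of the Microsoft identity platform.

```csharp
// Hypothetical options model; names are illustrative assumptions.
public class AuthenticationOptions
{
    // Directory (tenant) Id of the Azure AD tenant.
    public string TenantId { get; set; }

    // Application (client) Id of the app registration.
    public string ClientId { get; set; }

    // Client secret created under Certificates & Secrets.
    public string ClientSecret { get; set; }

    // ".default" asks for the application permissions already granted
    // to the app registration (assumes the v2.0 token endpoint).
    public string Scope { get; set; } = "https://graph.microsoft.com/.default";
}
```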
And of course we need to provide these options in our local.settings.json for local debugging:
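As a sketch, the settings file could look like the following. The section name Authentication and the key names are assumptions chosen for illustration; the colon-separated keys are what lets the configuration system bind them to an options class. The values in angle brackets are placeholders for the values saved from the app registration.

```json
{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "UseDevelopmentStorage=true",
    "FUNCTIONS_WORKER_RUNTIME": "dotnet",
    "Authentication:TenantId": "<directory-tenant-id>",
    "Authentication:ClientId": "<application-client-id>",
    "Authentication:ClientSecret": "<client-secret>"
  }
}
```

Since this file now holds a secret, it is best kept out of source control.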
The authentication service will use those settings to request an access token that we can pass along with the requests to Microsoft Graph later. This can be done with a simple HTTP request:
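The following is a sketch of such a token request, assuming the v2.0 token endpoint of the Microsoft identity platform and System.Text.Json for parsing. The class and method names are hypothetical, and the credentials are passed as plain parameters here to keep the example self-contained.

```csharp
using System.Collections.Generic;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical helper; names are illustrative assumptions.
public static class TokenClient
{
    // Token endpoint (v2.0) for a single tenant.
    public static string BuildTokenUrl(string tenantId) =>
        $"https://login.microsoftonline.com/{tenantId}/oauth2/v2.0/token";

    // Form fields of the client credentials grant.
    public static Dictionary<string, string> BuildTokenRequest(string clientId, string clientSecret) =>
        new Dictionary<string, string>
        {
            ["grant_type"] = "client_credentials",
            ["client_id"] = clientId,
            ["client_secret"] = clientSecret,
            ["scope"] = "https://graph.microsoft.com/.default"
        };

    // Posts the form and returns the access_token field of the JSON response.
    public static async Task<string> GetAccessTokenAsync(
        HttpClient http, string tenantId, string clientId, string clientSecret)
    {
        using var body = new FormUrlEncodedContent(BuildTokenRequest(clientId, clientSecret));
        using var response = await http.PostAsync(BuildTokenUrl(tenantId), body);
        response.EnsureSuccessStatusCode();

        using var json = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
        return json.RootElement.GetProperty("access_token").GetString();
    }
}
```

The token response also carries an expires_in value, so a real service would probably cache the token instead of requesting a new one for every call.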
Step 3: Create a File Service to upload, convert and delete files using the Microsoft Graph
Function app settings for the file service
For our file operations we need to provide 2 options to a file service:
- the Graph Endpoint to use and
- the SiteId of the site to upload files to and convert them in
So like with authentication let’s start by defining an options model:
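A possible shape for this model is sketched below. The name PdfOptions is a guess based on the "pdf options" mentioned later in the post, and the property names are assumptions as well.

```csharp
// Hypothetical options model for the file service; names are assumptions.
public class PdfOptions
{
    // Base URL of the Graph API version to call, with a trailing slash,
    // e.g. "https://graph.microsoft.com/beta/".
    public string GraphEndpoint { get; set; }

    // Id of the SharePoint site that temporarily stores the input files.
    public string SiteId { get; set; }
}
```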
Creating a site and finding the site id in SharePoint
Before we can upload anything to SharePoint we need to create a new SharePoint site and get its id. So let’s create a new SharePoint site, e.g. PDFDemo, in the portal (https://portal.office.com → SharePoint).
Getting the id of a SharePoint site is actually a bit tricky, as I have yet to find a place in SharePoint where it’s actually shown. My trick for getting it is to open the Microsoft Graph Explorer (https://developer.microsoft.com/en-us/graph/graph-explorer), log in with the O365 admin account and call this url: https://graph.microsoft.com/v1.0/sites/YOUR_SHAREPOINT_URL/:/sites/NAME_OF_SITE/?$select=id where YOUR_SHAREPOINT_URL is something like medialessondev.sharepoint.com. This returns a JSON response whose id property holds the site id, a comma-separated value made up of the host name and two GUIDs (site collection and web).
File service settings
So let’s add the full value of the id along with the Graph endpoint to our local.settings.json:
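As a sketch, the two additional entries could look like this. Only the new keys are shown, and the existing entries in Values stay as they are. The section name Pdf and the key names are assumptions for illustration, and the site id is a placeholder showing the three comma-separated parts.

```json
{
  "Values": {
    "Pdf:GraphEndpoint": "https://graph.microsoft.com/beta/",
    "Pdf:SiteId": "<hostname>,<site-collection-guid>,<web-guid>"
  }
}
```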
Implement file service
The file service will provide 3 methods:
- Upload a file to SharePoint
- Convert the file to pdf
- Delete the original file in SharePoint
The file service will use the authentication service to acquire an access token and use it for all 3 requests to Microsoft Graph. So let’s pass the authentication service and the options in through the constructor and write a helper method to create an HttpClient that uses the access token:
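A minimal sketch of that constructor and helper follows. IAuthenticationService is a hypothetical abstraction over the authentication service, and the Graph endpoint is passed as a plain string here to keep the example self-contained; in the function app it would come from the injected options.

```csharp
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

// Hypothetical abstraction over the authentication service from step 2.
public interface IAuthenticationService
{
    Task<string> GetAccessTokenAsync();
}

public class FileService
{
    private readonly IAuthenticationService _authenticationService;

    // Base URL used by the upload, convert and delete requests.
    public string GraphEndpoint { get; }

    public FileService(IAuthenticationService authenticationService, string graphEndpoint)
    {
        _authenticationService = authenticationService;
        GraphEndpoint = graphEndpoint;
    }

    // Creates an HttpClient that sends the access token as a bearer token
    // with every request.
    public async Task<HttpClient> CreateAuthorizedHttpClientAsync()
    {
        var token = await _authenticationService.GetAccessTokenAsync();
        var client = new HttpClient();
        client.DefaultRequestHeaders.Authorization =
            new AuthenticationHeaderValue("Bearer", token);
        return client;
    }
}
```

Creating a new HttpClient per call keeps the sketch short; for a function under heavy load, a shared client or IHttpClientFactory is likely the better choice.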
Let’s add a new method to upload a file to our SharePoint site:
- I’m using the library MediaTypeMap.core to get the correct file extension from the file’s content type
- It’s important to also specify the file’s content type in the request header
- We just need to return the id of the saved file in SharePoint in order to work with this file in the following methods
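The upload described by these notes could be sketched as follows. This is an illustration under assumptions: the helper names are hypothetical, path is expected to look like sites/{siteId}/drive/items/, and the file extension (including the leading dot) is passed in by the caller instead of being looked up with MediaTypeMap.core.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical helper; names and URL composition are assumptions.
public static class GraphUpload
{
    // "root:/{fileName}:/content" addresses a file by name in the drive's root folder.
    public static string BuildUploadUrl(string graphEndpoint, string path, string fileName) =>
        $"{graphEndpoint}{path}root:/{fileName}:/content";

    public static async Task<string> UploadAsync(
        HttpClient authorizedClient, string graphEndpoint, string path,
        Stream content, string contentType, string fileExtension)
    {
        // Random temporary name so parallel conversions do not overwrite each other.
        var fileName = $"{Guid.NewGuid()}{fileExtension}";

        using var body = new StreamContent(content);
        // The content type of the file goes into the request header.
        body.Headers.ContentType = MediaTypeHeaderValue.Parse(contentType);

        using var response = await authorizedClient.PutAsync(
            BuildUploadUrl(graphEndpoint, path, fileName), body);
        response.EnsureSuccessStatusCode();

        // Graph answers with the created DriveItem; only its id is needed.
        using var json = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
        return json.RootElement.GetProperty("id").GetString();
    }
}
```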
Next we need to implement a method to convert this file to pdf and retrieve it from SharePoint. This can be done in a single Graph request providing the path, fileId and target format:
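A sketch of that single request, using hypothetical helper names and assuming path has the form sites/{siteId}/drive/items/. The format query parameter is the one documented for the DriveItem content endpoint.

```csharp
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper; names and URL composition are assumptions.
public static class GraphConvert
{
    public static string BuildConvertUrl(
        string graphEndpoint, string path, string fileId, string targetFormat) =>
        $"{graphEndpoint}{path}{fileId}/content?format={targetFormat}";

    public static async Task<byte[]> DownloadConvertedAsync(
        HttpClient authorizedClient, string graphEndpoint, string path,
        string fileId, string targetFormat)
    {
        // Graph replies with a redirect to a download URL for the converted
        // file; HttpClient follows redirects by default.
        using var response = await authorizedClient.GetAsync(
            BuildConvertUrl(graphEndpoint, path, fileId, targetFormat));
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsByteArrayAsync();
    }
}
```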
Finally we also need a method to delete the original file in SharePoint once we have successfully converted it:
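The delete call could look roughly like this, again with hypothetical helper names and the same assumed path format.

```csharp
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper; names and URL composition are assumptions.
public static class GraphDelete
{
    public static string BuildItemUrl(string graphEndpoint, string path, string fileId) =>
        $"{graphEndpoint}{path}{fileId}";

    public static async Task DeleteAsync(
        HttpClient authorizedClient, string graphEndpoint, string path, string fileId)
    {
        using var response = await authorizedClient.DeleteAsync(
            BuildItemUrl(graphEndpoint, path, fileId));
        // Graph answers with 204 No Content when the item was deleted.
        response.EnsureSuccessStatusCode();
    }
}
```

Items deleted this way are moved to the recycle bin of the site rather than removed permanently.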
Step 4: Set up Dependency Injection
In order to use the 2 services we just created along with their options we need to create them somewhere. This is where dependency injection comes in handy. To use dependency injection in an Azure Functions app we need to add the package Microsoft.Azure.Functions.Extensions to our app using NuGet.
With this package we can now configure our services and options in a startup class:
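A sketch of such a startup class is shown below. It assumes the settings live in sections named Authentication and Pdf, and it declares minimal stand-ins for the options and service types so that it compiles on its own; all of these names are illustrative rather than taken from the sample repository.

```csharp
using Microsoft.Azure.Functions.Extensions.DependencyInjection;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Options;

[assembly: FunctionsStartup(typeof(PdfConversionDemo.Startup))]

namespace PdfConversionDemo
{
    public class Startup : FunctionsStartup
    {
        public override void Configure(IFunctionsHostBuilder builder)
        {
            // Bind the "Authentication:*" and "Pdf:*" settings to the options classes.
            builder.Services.AddOptions<AuthenticationOptions>()
                .Configure<IConfiguration>((options, configuration) =>
                    configuration.GetSection("Authentication").Bind(options));
            builder.Services.AddOptions<PdfOptions>()
                .Configure<IConfiguration>((options, configuration) =>
                    configuration.GetSection("Pdf").Bind(options));

            // Both services are stateless, so one instance per host is enough.
            builder.Services.AddSingleton<AuthenticationService>();
            builder.Services.AddSingleton<FileService>();
        }
    }

    // Stand-ins for the types from steps 2 and 3 (assumed shapes).
    public class AuthenticationOptions
    {
        public string TenantId { get; set; }
        public string ClientId { get; set; }
        public string ClientSecret { get; set; }
    }

    public class PdfOptions
    {
        public string GraphEndpoint { get; set; }
        public string SiteId { get; set; }
    }

    public class AuthenticationService
    {
        public AuthenticationService(IOptions<AuthenticationOptions> options) { }
    }

    public class FileService
    {
        public FileService(AuthenticationService authentication, IOptions<PdfOptions> options) { }
    }
}
```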
Step 5: Create a new function
Add a new function to your project and name it ConvertToPdf. Select the Http trigger so our function can be called via an HTTP request and pick Authorization level Anonymous so we don’t need to provide any credentials when calling this function.
In order to use dependency injection in this function the class and method need to be non-static. We also need to inject the file service and the pdf options into the constructor and save them into readonly fields:
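The skeleton of such a function class might look like this. IFileService and PdfOptions are stand-ins declared only so the sketch compiles on its own, and the body of Run is a placeholder for the steps that follow.

```csharp
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Extensions.Options;

// Stand-ins for the file service and its options (assumed shapes).
public interface IFileService { }

public class PdfOptions
{
    public string GraphEndpoint { get; set; }
    public string SiteId { get; set; }
}

// Non-static class, so the Functions runtime can use constructor injection.
public class ConvertToPdf
{
    private readonly IFileService _fileService;
    private readonly PdfOptions _options;

    public ConvertToPdf(IFileService fileService, IOptions<PdfOptions> options)
    {
        _fileService = fileService;
        _options = options.Value;
    }

    [FunctionName("ConvertToPdf")]
    public Task<IActionResult> Run(
        [HttpTrigger(AuthorizationLevel.Anonymous, "post", Route = null)] HttpRequest req)
    {
        // Placeholder: upload, convert and delete happen here (steps 6 to 8).
        return Task.FromResult<IActionResult>(new OkResult());
    }
}
```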
Step 6: Upload the input file to a drive
With all services and settings and plumbing in place we can upload the file we receive in the function method using our file service:
Please note how the path variable is composed to point to the root folder of the SharePoint site.
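One plausible way to compose such a path is sketched below; the exact format used in the original sample is an assumption here. Appending root:/{fileName}:/content to this path targets a file in the root folder of the site's default document library, while appending an item id addresses an existing file.

```csharp
// Hypothetical helper; the path format is an assumption.
public static class DrivePath
{
    // Addresses items in the default document library of a SharePoint site.
    public static string ForSite(string siteId) => $"sites/{siteId}/drive/items/";
}
```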
Step 7: Download the file in pdf format
The next step is to download the pdf converted version of the file we just uploaded using the file id:
Step 8: Delete the original file from the drive and return the pdf
Almost there! Before returning the converted pdf file to the caller of the function let’s clean up by deleting the original file from SharePoint:
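Taken together, steps 6 to 8 could be sketched as one small pipeline. The IFileService interface and its method names are hypothetical. Note one deliberate difference from the description above: the delete runs in a finally block, so the input file is also removed when the conversion fails.

```csharp
using System.IO;
using System.Threading.Tasks;

// Hypothetical interface over the file service from step 3.
public interface IFileService
{
    Task<string> UploadAsync(string path, Stream content, string contentType);
    Task<byte[]> DownloadConvertedAsync(string path, string fileId, string targetFormat);
    Task DeleteAsync(string path, string fileId);
}

public static class PdfPipeline
{
    public static async Task<byte[]> ConvertAsync(
        IFileService files, string path, Stream input, string contentType)
    {
        // Step 6: upload the input file and remember its id.
        var fileId = await files.UploadAsync(path, input, contentType);
        try
        {
            // Step 7: download the same file in pdf format.
            return await files.DownloadConvertedAsync(path, fileId, "pdf");
        }
        finally
        {
            // Step 8: clean up, even if the conversion threw.
            await files.DeleteAsync(path, fileId);
        }
    }
}
```

Inside the function, the returned bytes can then be handed back to the caller, for example with new FileContentResult(pdf, "application/pdf").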
We made it!
This was a long post but I hope you still enjoyed it. I think this is a pretty convenient way of converting files to pdf if you already have Office 365. The Azure AD authentication part is some work but can be reused in many ways when interacting with Microsoft Graph. Doing this in Azure Functions lets me provide this functionality as a micro service to existing business processes, and on the Consumption plan it is free for up to 1 million requests per month.