A Guide to Gotchas with LangChain Document Loaders in a Chrome Extension

Andrew Nguonly
8 min read · Apr 2, 2024

This is the sixth article in a series of articles about Lumos, a RAG LLM co-pilot for browsing the web. Reading the prior articles is recommended!

ChatGPT-4 prompt: “Create an image of a green parrot doing party tricks. The parrot is juggling multiple types of documents (pdf, csv, txt). Generate the image in the style of Where’s Waldo. Now update the image so there’s a storm (wind, rain, thunder, etc) in the background.”

A Perfect Storm 🌩️

This article is a guide to integrating LangChain document loaders into a Chrome extension. At first glance, this seems like a straightforward task, but it turned out to be more complicated than I expected. LangChain provides document loaders that run in Node.js and browser environments, but a Chrome extension’s service worker runtime is neither. A few gotchas and workarounds are involved, so I thought I’d share how it was all done. Admittedly, I’m not an expert in JavaScript or its ecosystem, so forgive me if some of what I did is suboptimal. In those instances, please share your feedback!

If you’ve somehow found yourself in the perfect storm of building a Chrome extension with React, TypeScript, webpack, and LangChain, read on!

Attaching Files to Lumos 🪄📎

“Chatting” with a document is the latest feature in Lumos.

The implementation uses LangChain document loaders to parse the contents of a file and pass them to Lumos’s online, in-memory RAG workflow. Using the existing workflow was the main, self-imposed constraint in the design, but the following were also secondary considerations:

  • Scalability: The design should scale to any document loader (i.e. any file extension).
  • Usability: End users shouldn’t have to specify the desired document loader for a given file extension.
  • Performance: Document loading should occur in the Chrome extension’s service worker runtime.
  • Build Efficiency: As much as possible, the packaged extension code should not be bloated.

In the end, not all goals were achieved and compromises were made, but the final functionality still delivers. Chat with your PDF, in the browser, using local LLMs.

DynamicFileLoader ⚡

To achieve the scalability and usability goals, the implementation must infer the desired document loader for a given file. LangChain provides a DirectoryLoader class with similar functionality, but in the context of a Chrome extension, the local file system is not easily accessible. A file must instead be uploaded through an <input> component in the extension’s popup, which exposes the upload as a File object.

Unfortunately, DirectoryLoader only accepts a local directory path (string), not a File. Nevertheless, it’s quite easy to create a custom document loader with basically the same functionality. The implementation of Lumos’s DynamicFileLoader is mostly copied from DirectoryLoader.

import { Document } from "@langchain/core/documents";
import { BaseDocumentLoader } from "langchain/document_loaders/base";
import { TextLoader } from "langchain/document_loaders/fs/text";

import { getExtension } from "./util";

export interface LoadersMapping {
  [extension: string]: (file: File) => BaseDocumentLoader;
}

export class DynamicFileLoader extends BaseDocumentLoader {
  constructor(
    public file: File,
    public loaders: LoadersMapping,
  ) {
    ...
  }

  public async load(): Promise<Document[]> {
    const documents: Document[] = [];
    const extension = getExtension(this.file.name, true);
    let loader;

    if (extension !== "" && extension in this.loaders) {
      const loaderFactory = this.loaders[extension];
      loader = loaderFactory(this.file);
    } else {
      // default to using text loader
      loader = new TextLoader(this.file);
    }

    documents.push(...(await loader.load()));
    return documents;
  }
}

DynamicFileLoader accepts a File object instead of a directory path and defaults to using LangChain’s TextLoader if a given file extension does not have a matching document loader based on the provided configuration. DynamicFileLoader is initialized with a mapping of extension type to document loader.

import { JSONLoader } from "langchain/document_loaders/fs/json";

import { CSVPackedLoader } from "../document_loaders/csv";
import { DynamicFileLoader } from "../document_loaders/dynamic_file";

const loader = new DynamicFileLoader(file, {
  // add more loaders here
  ".csv": (file) => new CSVPackedLoader(file),
  ".json": (file) => new JSONLoader(file),
});

Document loaders can be added to (or removed from) the LoadersMapping configuration, and end users never need to specify the mapping. The loading mechanism works because the individual document loaders already accept Blob objects (File is a subclass of Blob). The number of extension types is fairly limited, which makes this approach scalable for the foreseeable future.
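The getExtension helper imported above isn’t shown in the article. A minimal sketch of what it might look like (Lumos’s actual implementation may differ) returns the extension with its leading dot, matching the keys in LoadersMapping:

```typescript
// Hypothetical sketch of the getExtension helper; Lumos's actual
// implementation may differ.
export const getExtension = (
  filename: string,
  lowercase = false,
): string => {
  const dotIndex = filename.lastIndexOf(".");
  // no extension (or a dotfile like ".env")
  if (dotIndex <= 0) return "";
  const extension = filename.slice(dotIndex);
  return lowercase ? extension.toLowerCase() : extension;
};
```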

Package All Dependencies 📦

LangChain document loaders use dynamic imports, which helps keep applications efficient, but for a webpacked application with code running in an extension’s service worker, this will not work. The service worker runtime does not have dependencies available to it unless they’re explicitly imported and packaged.

The following error occurs when running LangChain’s CSVLoader from the extension’s service worker: Error: Please install d3-dsv as a dependency with, e.g. `yarn add d3-dsv@2`.

Chrome extension inspector console

A workaround is to create a new document loader that does not use dynamic imports. This is the solution used in Lumos. To maintain the same constructor interface, CSVPackedLoader is subclassed from CSVLoader.

import { dsvFormat } from "d3-dsv";
import { CSVLoader } from "langchain/document_loaders/fs/csv";

export class CSVPackedLoader extends CSVLoader {
  /**
   * This function is copied from the CSVLoader class with a few
   * modifications so that it's able to run in a Chrome extension
   * context.
   */
  public async parse(raw: string): Promise<string[]> {
    const { column, separator = "," } = this.options;

    // comment out dynamic import
    // const { dsvFormat } = await CSVLoaderImports();
    const psv = dsvFormat(separator);
    ...
  }
}

Because different LangChain document loaders require different npm packages (e.g. PPTXLoader requires officeparser), adding more document loaders to the application increases the number of packages required and therefore increases the size of the extension. Although undesirable, this side effect is not noticeable at runtime because service worker scripts run in a separate thread from the main extension code. However, the impact is felt at build time (slower builds).

Loading with unsafe-eval ⛔

Since the introduction of Manifest V3, executing remote code is no longer allowed in Chrome extensions (with some exceptions). This change is critical for making extensions more secure. Extensions that rely on remote code execution (e.g. executeScript(), eval(), and new Function()) now have the burden of migrating to a V3 compatible implementation.

The following error (or similar) occurs during runtime if a Chrome extension executes disallowed code: EvalError: Refused to evaluate a string as JavaScript because 'unsafe-eval' is not allowed.

Chrome extension inspector console

Coincidentally, the CSVLoader implementation contains one of these functions. Specifically, its parse() function calls d3-dsv’s DSV.parse(), whose JSDoc explicitly mentions requiring the unsafe-eval content security policy.

The parse() implementation in CSVPackedLoader overrides the parent implementation so that the unsafe-eval content security policy is not required. The following code demonstrates how to migrate from DSV.parse() to DSV.parseRows(). The downstream implementation of the function diverges from that of the parent (CSVLoader.parse()).

public async parse(raw: string): Promise<string[]> {
  const { column, separator = "," } = this.options;

  const psv = dsvFormat(separator);
  // cannot use psv.parse(), unsafe-eval is not allowed
  let parsed = psv.parseRows(raw.trim());

  if (column !== undefined) {
    if (!parsed[0].includes(column)) {
      throw new Error(`Column ${column} not found in CSV file.`);
    }
    // get index of column
    const columnIndex = parsed[0].indexOf(column);
    // Note TextLoader will raise an exception if the value is null.
    // eslint-disable-next-line @typescript-eslint/no-non-null-assertion
    return parsed.map((row) => row[columnIndex]!);
  }

  // parsed = [["foo", "bar"], ["1", "2"], ["3", "4"]]
  // output strings = ["foo: 1\nbar: 2", "foo: 3\nbar: 4"]

  // get first element of parsed
  const headers = parsed[0];
  parsed = parsed.slice(1);

  return parsed.map((row) =>
    row.map((key, index) => `${headers[index]}: ${key}`).join("\n"),
  );
}

In cases where a document loader’s dependency contains remote executable code, a subclassed implementation may be required. Otherwise, there may be alternative solutions where code is executed in a sandboxed iframe and passed back to the main application.
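As a rough illustration of the sandbox approach (a hypothetical configuration, not something Lumos uses), Manifest V3 allows an extension to declare sandboxed pages that are exempt from the extension’s content security policy. The sandboxed page can run eval-dependent code and exchange results with the rest of the extension via postMessage. A manifest.json fragment might look like:

```json
{
  "manifest_version": 3,
  "sandbox": {
    "pages": ["sandbox.html"]
  }
}
```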

Loading with DOM/Browser APIs 🖥️

By default, the Chrome extension service worker runtime does not have access to DOM/browser APIs such as document and window. Unfortunately, the core dependency of LangChain’s WebPDFLoader, PDF.js (via pdf-parse), uses these APIs (for some reason 😑) to parse PDF files. This makes it impossible to run WebPDFLoader in an extension’s service worker script without a polyfill implementation to make DOM/browser APIs available.

Days later…

After spending too much time wrangling packages and messing with Lumos’s webpack configuration, I conceded and moved WebPDFLoader to run in the extension popup instead of the service worker. The solution is not ideal. However, it still leverages the same workflow as the other document loaders running in the background. The text content of a PDF is parsed in the popup, passed back to the service worker as a plain text document, and parsed again by a TextLoader. Notably, this approach avoids creating another subclass and creates a path for supporting other document loaders that depend on DOM/browser APIs.
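The popup-side handoff can be sketched roughly as follows. The helper and message names here are illustrative assumptions, not Lumos’s actual identifiers; in the real flow, WebPDFLoader produces the Document array that gets flattened into plain text.

```typescript
// Minimal local stand-in for LangChain's Document type (illustrative).
interface Doc {
  pageContent: string;
}

// Flatten per-page PDF content into a single plain text payload that
// the service worker can feed to a TextLoader.
export function toPlainText(docs: Doc[]): string {
  return docs.map((doc) => doc.pageContent).join("\n");
}

// The chrome global only exists inside the extension runtime.
declare const chrome: any;

// Send the parsed text to the service worker (hypothetical message shape).
export function sendToServiceWorker(name: string, content: string): void {
  if (typeof chrome === "undefined") return; // not running in an extension
  chrome.runtime.sendMessage({ type: "file", name, content });
}
```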

Resolve Fallbacks and Externals in Webpack 🎒

Lastly, to get all the document loaders and dependencies packaged and running as expected, I had to modify Lumos’s webpack configuration to exclude polyfills (i.e. resolve fallbacks) and define externals for the loaders’ dependencies. Note: webpack 5 does not include Node.js core modules by default.

When building a webpacked application, you may encounter the following error (or similar):

ERROR in ./node_modules/pdf-parse/lib/pdf.js/v1.10.100/build/pdf.js 17381:10-24
Module not found: Error: Can't resolve 'url' in '/Users/andrewnguonly/Documents/workspace/github/andrewnguonly/Lumos/node_modules/pdf-parse/lib/pdf.js/v1.10.100/build'

BREAKING CHANGE: webpack < 5 used to include polyfills for node.js core modules by default.
This is no longer the case. Verify if you need this module and configure a polyfill for it.

If you want to include a polyfill, you need to:
- add a fallback 'resolve.fallback: { "url": require.resolve("url/") }'
- install 'url'
If you don't want to include a polyfill, you can use an empty module like this:
resolve.fallback: { "url": false }

If so, you might be able to exclude polyfills for a given document loader’s dependencies. This should be tested on a case-by-case basis. The following is a snippet of Lumos’s webpack.config.js file. Each resolve fallback and external is commented with the associated LangChain document loader.

{
  ...
  resolve: {
    extensions: [".tsx", ".ts", ".js"],
    fallback: {
      // enable use of LangChain document loaders
      fs: false, // TextLoader
      zlib: false, // WebPDFLoader
      http: false, // WebPDFLoader
      https: false, // WebPDFLoader
      url: false, // WebPDFLoader
    },
  },
  output: {
    path: path.join(__dirname, "dist/js"),
    filename: "[name].js",
  },
  externals: {
    // enable use of LangChain document loaders
    "node:fs/promises": "commonjs2 node:fs/promises", // TextLoader
  },
};

Extra Credit 😄

In addition to leveraging the existing RAG workflow in Lumos, the new file attachment feature also supports attaching images (jpeg, jpg, png) without interrupting the use of document loaders. If you’re interested in seeing how files can be transferred between a Chrome extension popup and a service worker via message passing, take a look at the source code!

Hint: base64 encode
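File objects themselves can’t cross the message-passing boundary because chrome.runtime.sendMessage payloads must be JSON-serializable, so the file bytes can be base64 encoded first. A rough sketch of the round trip (helper names are mine, not Lumos’s):

```typescript
// Encode raw file bytes to base64 for JSON-serializable message passing.
// Helper names are illustrative, not Lumos's actual identifiers.
export function bytesToBase64(bytes: Uint8Array): string {
  let binary = "";
  for (const byte of bytes) {
    binary += String.fromCharCode(byte);
  }
  return btoa(binary);
}

// Decode the base64 payload back to bytes on the service worker side.
export function base64ToBytes(base64: string): Uint8Array {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}
```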
