A Guide to Gotchas with LangChain Document Loaders in a Chrome Extension
This is the sixth article in a series of articles about Lumos, a RAG LLM co-pilot for browsing the web. Reading the prior articles is recommended!
A Perfect Storm 🌩️
This article is a guide to integrating LangChain document loaders into a Chrome extension. At first glance, this seems like a straightforward task, but it turned out to be a bit more complicated than I expected. LangChain provides document loaders that run in Node.js and browser environments, but a Chrome extension’s service worker runtime is neither. A few gotchas and workarounds are involved, so I thought I’d share how it was all done. Admittedly, I’m not an expert in JavaScript or its ecosystem, so forgive me if some of what I did is suboptimal. In those instances, please share your feedback!
If you’ve somehow found yourself in the perfect storm of building a Chrome extension with React, TypeScript, webpack, and LangChain, read on!
Attaching Files to Lumos 🪄📎
“Chatting” with a document is the latest feature in Lumos.
The implementation uses LangChain document loaders to parse the contents of a file and pass them to Lumos’s online, in-memory RAG workflow. Using the existing workflow was the main, self-imposed constraint in the design, but the following were also secondary considerations:
- Scalability: The design should scale to any document loader (i.e. any file extension).
- Usability: End users shouldn’t have to specify the desired document loader for a given file extension.
- Performance: Document loading should occur in the Chrome extension’s service worker runtime.
- Build Efficiency: As much as possible, the packaged extension code should not be bloated.
In the end, not all goals were achieved and compromises were made, but the final functionality still delivers. Chat with your PDF, in the browser, using local LLMs.
DynamicFileLoader ⚡
To achieve the scalability and usability goals, the implementation must infer the desired document loader to use for a particular file. LangChain provides a `DirectoryLoader` class with similar functionality, but in the context of a Chrome extension, the local file system is not easily accessible. A file must be uploaded through an `<input>` component in the extension’s popup, which is then serialized to a `File` object.

Unfortunately, `DirectoryLoader` only accepts a local directory path (`string`), not a `File`. Nevertheless, it’s quite easy to create a custom document loader with essentially the same functionality. The implementation of Lumos’s `DynamicFileLoader` is mostly copied from `DirectoryLoader`.
```typescript
import { Document } from "@langchain/core/documents";
import { BaseDocumentLoader } from "langchain/document_loaders/base";
import { TextLoader } from "langchain/document_loaders/fs/text";
import { getExtension } from "./util";

export interface LoadersMapping {
  [extension: string]: (file: File) => BaseDocumentLoader;
}

export class DynamicFileLoader extends BaseDocumentLoader {
  constructor(
    public file: File,
    public loaders: LoadersMapping,
  ) {
    super();
  }

  public async load(): Promise<Document[]> {
    const documents: Document[] = [];
    const extension = getExtension(this.file.name, true);
    let loader;

    if (extension !== "" && extension in this.loaders) {
      const loaderFactory = this.loaders[extension];
      loader = loaderFactory(this.file);
    } else {
      // default to using text loader
      loader = new TextLoader(this.file);
    }

    documents.push(...(await loader.load()));
    return documents;
  }
}
```
`DynamicFileLoader` accepts a `File` object instead of a directory path and defaults to LangChain’s `TextLoader` if a given file extension does not have a matching document loader in the provided configuration. `DynamicFileLoader` is initialized with a mapping of extension type to document loader.
```typescript
import { JSONLoader } from "langchain/document_loaders/fs/json";
import { CSVPackedLoader } from "../document_loaders/csv";
import { DynamicFileLoader } from "../document_loaders/dynamic_file";

const loader = new DynamicFileLoader(file, {
  // add more loaders here
  ".csv": (file) => new CSVPackedLoader(file),
  ".json": (file) => new JSONLoader(file),
});
```
Document loaders can be added to (or removed from) the `LoadersMapping` configuration, and end users never need to specify the mapping. The loading mechanism works because the individual document loaders already accept `Blob` objects (`File` is a subclass of `Blob`). The number of extension types is fairly limited, which keeps this approach scalable for the foreseeable future.
Package All Dependencies 📦
LangChain document loaders use dynamic importing, which helps application efficiency, but for a webpacked application with code running in an extension’s service worker, this will not work. The service worker runtime does not have dependencies available to it unless they’re explicitly imported and packaged.
The following error occurs when running LangChain’s `CSVLoader` from the extension’s service worker:

```
Error: Please install d3-dsv as a dependency with, e.g. `yarn add d3-dsv@2`
```
A workaround is to create a new document loader that does not use dynamic imports. This is the solution used in Lumos. To maintain the same constructor interface, `CSVPackedLoader` subclasses `CSVLoader`.
```typescript
import { dsvFormat } from "d3-dsv";
import { CSVLoader } from "langchain/document_loaders/fs/csv";

export class CSVPackedLoader extends CSVLoader {
  /**
   * This function is copied from the CSVLoader class with a few
   * modifications so that it's able to run in a Chrome extension
   * context.
   */
  public async parse(raw: string): Promise<string[]> {
    const { column, separator = "," } = this.options;

    // comment out dynamic import
    // const { dsvFormat } = await CSVLoaderImports();
    const psv = dsvFormat(separator);
    // ...
  }
}
```
Because different LangChain document loaders require different `npm` packages (e.g. `PPTXLoader` requires `officeparser`), adding more document loaders to the application increases the number of packages required and therefore the size of the extension. Although undesirable, this side effect is not noticeable at runtime because service worker scripts run in a separate thread from the main extension code. However, the impact is felt at build time (slower builds).
Loading with unsafe-eval ⛔
Since the introduction of Manifest V3, executing remote code is no longer allowed in Chrome extensions (with some exceptions). This change is critical for making extensions more secure. Extensions that rely on remote code execution (e.g. `executeScript()`, `eval()`, and `new Function()`) now have the burden of migrating to a V3-compatible implementation.
The following error (or similar) occurs at runtime if a Chrome extension executes disallowed code:

```
EvalError: Refused to evaluate a string as JavaScript because 'unsafe-eval' is not allowed
```
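For reference, Manifest V3 enforces a content security policy on extension pages that cannot be relaxed to allow `unsafe-eval`; it is only permitted in sandboxed pages. A minimal sketch of the relevant `manifest.json` fields (the sandbox page name is hypothetical):

```json
{
  "manifest_version": 3,
  "content_security_policy": {
    "extension_pages": "script-src 'self'; object-src 'self'",
    "sandbox": "sandbox allow-scripts; script-src 'self' 'unsafe-eval'; object-src 'self'"
  },
  "sandbox": {
    "pages": ["sandbox.html"]
  }
}
```

A service worker is not a sandboxed page, so code running there must avoid `eval()` and `new Function()` entirely, which is exactly what the override below does.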
Coincidentally, the `CSVLoader` implementation contains one of these calls. Specifically, the `parse()` function calls `d3-dsv`’s `DSV.parse()`, whose JSDoc specifically mentions requiring the unsafe-eval content security policy.
The implementation of `parse()` in `CSVPackedLoader` overrides the parent implementation so that the unsafe-eval content security policy is not required. The following code demonstrates how to migrate from `DSV.parse()` to `DSV.parseRows()`. The downstream implementation of the function diverges from that of the parent (`CSVLoader.parse()`).
```typescript
public async parse(raw: string): Promise<string[]> {
  const { column, separator = "," } = this.options;
  const psv = dsvFormat(separator);

  // cannot use psv.parse(), unsafe-eval is not allowed
  let parsed = psv.parseRows(raw.trim());

  if (column !== undefined) {
    if (!parsed[0].includes(column)) {
      throw new Error(`Column ${column} not found in CSV file.`);
    }
    // get index of column
    const columnIndex = parsed[0].indexOf(column);
    // Note: TextLoader will raise an exception if the value is null.
    // eslint-disable-next-line @typescript-eslint/no-non-null-assertion
    return parsed.map((row) => row[columnIndex]!);
  }

  // parsed = [["foo", "bar"], ["1", "2"], ["3", "4"]]
  // output strings = ["foo: 1\nbar: 2", "foo: 3\nbar: 4"]

  // the first row of parsed is the header row
  const headers = parsed[0];
  parsed = parsed.slice(1);
  return parsed.map((row) =>
    row.map((value, index) => `${headers[index]}: ${value}`).join("\n"),
  );
}
```
In cases where a document loader’s dependency executes code that is disallowed under Manifest V3, a subclassed implementation may be required. Alternatively, the offending code may be executed in a sandboxed iframe and the result passed back to the main application.
Loading with DOM/Browser APIs 🖥️
By default, the Chrome extension service worker runtime does not have access to DOM/browser APIs such as `document` and `window`. Unluckily, the core dependency of LangChain’s `WebPDFLoader`, PDF.js (via `pdf-parse`), uses these APIs (for some reason 😑) to parse PDF files. This makes it impossible to run `WebPDFLoader` in an extension’s service worker script without a polyfill implementation that makes DOM/browser APIs available.
Days later…
After spending too much time wrangling packages and messing with Lumos’s webpack configuration, I conceded to running `WebPDFLoader` in the extension popup instead of the service worker. The solution is not ideal, but it still leverages the same workflow as the other document loaders running in the background. The text content of a PDF is parsed in the popup, passed back to the service worker as a plain text document, and parsed again by a `TextLoader`. Notably, this approach avoids creating another subclass and creates a path for supporting other document loaders that depend on DOM/browser APIs.
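The popup-side handoff can be sketched as follows, under the assumption that the per-page documents are joined into a single plain-text payload before being sent to the service worker. The `flattenDocuments` helper and the message `type` field are my own naming, not Lumos’s actual API:

```typescript
// Minimal shape of a LangChain Document for this sketch.
interface Doc {
  pageContent: string;
}

// Join the per-page documents produced by WebPDFLoader into one
// plain-text payload. The service worker re-parses this text with
// a TextLoader, reusing the existing RAG workflow.
function flattenDocuments(docs: Doc[]): string {
  return docs.map((doc) => doc.pageContent).join("\n\n");
}

// In the popup (hypothetical wiring):
//   const loader = new WebPDFLoader(file);
//   const docs = await loader.load();
//   chrome.runtime.sendMessage({
//     type: "attach_text", // hypothetical message type
//     payload: flattenDocuments(docs),
//   });
```

Because the payload is plain text, it survives the JSON serialization that `chrome.runtime.sendMessage` applies to messages.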
Resolve Fallbacks and Externals in Webpack 🎒
Lastly, to get all the document loaders and dependencies packaged and running as expected, I had to modify Lumos’s webpack configuration to exclude polyfills (i.e. resolve fallbacks) and define externals for the loaders’ dependencies. Note: webpack 5 does not include Node.js core modules by default.
When building a webpacked application, you may encounter the following error (or similar):
```
ERROR in ./node_modules/pdf-parse/lib/pdf.js/v1.10.100/build/pdf.js 17381:10-24
Module not found: Error: Can't resolve 'url' in '/Users/andrewnguonly/Documents/workspace/github/andrewnguonly/Lumos/node_modules/pdf-parse/lib/pdf.js/v1.10.100/build'

BREAKING CHANGE: webpack < 5 used to include polyfills for node.js core modules by default.
This is no longer the case. Verify if you need this module and configure a polyfill for it.

If you want to include a polyfill, you need to:
  - add a fallback 'resolve.fallback: { "url": require.resolve("url/") }'
  - install 'url'
If you don't want to include a polyfill, you can use an empty module like this:
  resolve.fallback: { "url": false }
```
If so, you might be able to exclude polyfills for a given document loader’s dependencies. This should be tested on a case-by-case basis. The following is a snippet of Lumos’s `webpack.config.js` file. Each resolve fallback and external is commented with the associated LangChain document loader.
```javascript
{
  // ...
  resolve: {
    extensions: [".tsx", ".ts", ".js"],
    fallback: {
      // enable use of LangChain document loaders
      fs: false, // TextLoader
      zlib: false, // WebPDFLoader
      http: false, // WebPDFLoader
      https: false, // WebPDFLoader
      url: false, // WebPDFLoader
    },
  },
  output: {
    path: path.join(__dirname, "dist/js"),
    filename: "[name].js",
  },
  externals: {
    // enable use of LangChain document loaders
    "node:fs/promises": "commonjs2 node:fs/promises", // TextLoader
  },
};
```
Extra Credit 😄
In addition to leveraging the existing RAG workflow in Lumos, the new file attachment feature also supports attaching images (`jpeg`, `jpg`, `png`) without interrupting the use of document loaders. If you’re interested in seeing how files can be transferred between a Chrome extension popup and a service worker via message passing, take a look at the source code!

Hint: base64 encode
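Following that hint, one way to ship binary file contents across the message boundary is to base64-encode the bytes in the popup and decode them in the service worker. A minimal sketch (helper names and message shape are my own, not Lumos’s actual implementation):

```typescript
// Encode raw bytes to a base64 string so the file can travel through
// chrome.runtime.sendMessage, which only accepts JSON-serializable data.
function bytesToBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Decode the base64 string back into bytes on the service worker side.
function base64ToBytes(base64: string): Uint8Array {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Popup side (hypothetical):
//   const bytes = new Uint8Array(await file.arrayBuffer());
//   chrome.runtime.sendMessage({
//     type: "attach_file", // hypothetical message type
//     name: file.name,
//     data: bytesToBase64(bytes),
//   });
```

Chunking may be worthwhile for very large files, since each message is serialized in full, but for typical attachments a single message suffices.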
Over time, I expect to add more document loaders to Lumos as needs arise. Thinking ahead, I suspect the next big jump will be parsing audio and video, embedded or attached.
References
- Lumos (GitHub)
- Local LLM in the Browser Powered by Ollama (Part 1)
- Local LLM in the Browser Powered by Ollama (Part 2)
- Let’s Normalize Online, In-Memory RAG! (Part 3)
- Supercharging If-Statements With Prompt Classification Using Ollama and LangChain (Part 4)
- Bolstering LangChain’s MemoryVectorStore With Keyword Search (Part 5)