So you want to expand your selection?

Nicholas Dudfield
3 min read · Feb 8, 2024

Inside of a browser, up to a model’s token count. Say you set the model’s limit to 2500 tokens, and a token counter is shown for the current selection.

The expansion’s token count should match this counter, which incidentally matches the number of tokens counted if the user used the copy command. Say they want to paste the string into a chat app, or into some string in code: if they use the same class of tokeniser, they’ll get the same count.

The token counter is accurate, counting the right tokens, and so we must match it.

On the face of it, it sounds kind of trivial, no? Well, not so fast: range.toString() and selection.toString() behave differently. The former returns all text in the range, while the selection method returns only visible text! (Think CSS, etc.)

Whatever, that’s easy!

“So?”, I hear you ask, incredulous at the stupidity, “just loop over the segments you’ve created, setting the selection to each segment range, and then use selection.toString() and keep a tally.”

Right, to filter out the invisible text inside the ranges, which would mess up the token count, which is based on selection.toString()?
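For concreteness, that per-segment tally could look roughly like the sketch below. `SelectionLike` and `RangeLike` are made-up stand-ins for the DOM's Selection and Range (so the sketch runs outside a browser), and `countTokens` is a hypothetical hook for whatever tokeniser you use. In a page you would presumably pass window.getSelection() and real ranges.

```typescript
// Minimal structural stand-ins for the DOM types (assumptions for this sketch).
interface RangeLike {
  toString(): string;
}
interface SelectionLike {
  removeAllRanges(): void;
  addRange(range: RangeLike): void;
  toString(): string;
}

// Sums the token counts of the *visible* text of each segment by routing
// every range through the selection. Note this clobbers whatever the user
// had selected, and costs one tokeniser call per segment.
function tallyVisibleTokens(
  selection: SelectionLike,
  segmentRanges: RangeLike[],
  countTokens: (text: string) => number, // hypothetical tokeniser hook
): number {
  let total = 0;
  for (const range of segmentRanges) {
    selection.removeAllRanges();
    selection.addRange(range);
    total += countTokens(selection.toString());
  }
  return total;
}
```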

Ok, but if you do that it’s going to be quite “gluggy”, as there could be thousands of lines/segments, and then your crappy little app is going to be even less well received. Also, it doesn’t even seem to work.

Simply/naively expecting a string split up into parts, and counting the tokens in each part, to match the count of the string, is a recipe for disappointment. Consider: encode('word') will encode to 1 token, but encode('wo') and encode('rd') each yield their own token, so the parts add up to 2 tokens.
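Here is a toy illustration of that mismatch. The vocabulary and the greedy longest-match `toyEncode` are invented for this sketch. Real tokenisers (BPE and friends) are more involved, but they are not additive across arbitrary split points either.

```typescript
// Tiny made-up vocabulary; a real tokeniser has tens of thousands of entries.
const TOY_VOCAB = new Set(["word", "wo", "rd"]);

// Greedy longest-match: at each position take the longest vocabulary entry,
// falling back to a single character when nothing longer matches.
function toyEncode(text: string): string[] {
  const tokens: string[] = [];
  let i = 0;
  while (i < text.length) {
    let matched = text.charAt(i);
    for (let len = text.length - i; len > 1; len--) {
      const candidate = text.slice(i, i + len);
      if (TOY_VOCAB.has(candidate)) {
        matched = candidate;
        break;
      }
    }
    tokens.push(matched);
    i += matched.length;
  }
  return tokens;
}

const wholeCount = toyEncode("word").length; // 1: ["word"]
const partsCount = toyEncode("wo").length + toyEncode("rd").length; // 2: ["wo"] + ["rd"]
console.log({ wholeCount, partsCount });
```

So a per-segment tally can overshoot the count of the joined string, even when every segment is counted correctly on its own.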

Convergence on a solution

Ok, so maybe we can do some bisect thingy to slowly converge on the right selection? Hopefully that reduces the number of redundant token counts of selection.toString(). We don’t, after all, want to re-count the full selection’s tokens each time we add a segment in an effort to get closer.

Unconvinced? What, do you wanna faff around trying to determine which text is visible or not?? getComputedStyle?? Future Firefox? Future Safari?? You want to maintain that?? To be fair, having not gone down that path, we don’t know what we don’t know.

Let’s just use bisect/convergence, as it’s “relatively” simple. Besides, the background script is doing the tokenising, so less page glug.
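A minimal sketch of the bisect idea, not the actual implementation. `countPrefix(n)` is a hypothetical callback that selects the first n segments, reads selection.toString(), and returns its token count. It is synchronous here for brevity, whereas a round trip to a background script would be async. The sketch also assumes the count never shrinks as segments are added.

```typescript
// Finds the largest n in [0, total] such that countPrefix(n) <= limit,
// using O(log total) counts instead of one count per added segment.
function largestFittingPrefix(
  total: number,
  limit: number,
  countPrefix: (n: number) => number, // hypothetical: tokens in first n segments
): number {
  let lo = 0; // the empty prefix always fits
  let hi = total; // upper bound on the answer
  while (lo < hi) {
    // Round up so mid > lo, which guarantees progress on every iteration.
    const mid = Math.ceil((lo + hi) / 2);
    if (countPrefix(mid) <= limit) {
      lo = mid; // the first mid segments fit; try more
    } else {
      hi = mid - 1; // too many tokens; try fewer
    }
  }
  return lo;
}

// Usage with a stand-in "tokeniser" that just counts whitespace-separated words.
const demoSegments = ["one two ", "three ", "four five six ", "seven"];
const demoCount = (n: number): number =>
  demoSegments.slice(0, n).join("").split(/\s+/).filter(Boolean).length;

console.log(largestFittingPrefix(demoSegments.length, 4, demoCount)); // 2
```

Because each probe counts the whole prefix in one go, the split-token mismatch from per-segment tallies doesn’t come into it.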

Cool, we’re agreed (hopefully)

Just one selection

So where does your expansion actually start from? If the user is in the middle of the page, and the model has a short context, they aren’t going to want it to start from the top.

So you need to start at what’s visible. Recall that at the end we want to end up with ONE selection. But what about fixed headers!?

They will be “visible”, but if you include them and simply create a final selection that runs from the header to your end lines, then your selection will start off screen and, quite possibly, end off screen as well (above the viewport). What the hell just happened? This thing sucks.

So we must find the first visible lines on the screen. But recall, we want to expand to ONE selection. The screen may show separate parts of the document that aren’t actually adjacent in document order, with offscreen elements in between.

So we need to find the longest chain of adjacent nodes that start on the screen, and then expand downwards from there!
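As a rough sketch of the "longest chain" step: assume you already have one boolean per segment, in document order, saying whether it is on screen. How you compute that flag is up to you and is not shown here. `longestVisibleRun` and `VisibleRun` are names invented for this example.

```typescript
interface VisibleRun {
  start: number; // index of the first segment in the run
  length: number; // number of consecutive on-screen segments
}

// Returns the longest run of consecutive `true` flags (the first one on ties).
// A fixed header typically shows up as a short isolated run and loses out.
function longestVisibleRun(visible: boolean[]): VisibleRun {
  let best: VisibleRun = { start: 0, length: 0 };
  let runStart = -1; // -1 means we are not currently inside a run
  // Iterate one past the end so a run touching the last segment gets closed.
  for (let i = 0; i <= visible.length; i++) {
    const on = i < visible.length && visible[i] === true;
    if (on && runStart === -1) {
      runStart = i;
    } else if (!on && runStart !== -1) {
      const length = i - runStart;
      if (length > best.length) best = { start: runStart, length };
      runStart = -1;
    }
  }
  return best;
}

// Fixed header at index 0, offscreen gap, then the content actually in view.
console.log(longestVisibleRun([true, false, false, true, true, true, false]));
// { start: 3, length: 3 }
```

The expansion would then start at `start` and grow downwards in document order.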

Trivial, right? Well, not quite …

Not rocket surgery, but also not document.expandVisibleSelection(2500)
