Data safety and security on Notebook.ai just (quietly) got a whole lot better

Andrew Brown
Published in Indent Labs
Jan 18, 2018 · 5 min read


When we say data safety and security, we’re talking about two very important issues that even the biggest companies struggle with every day:

  • Providing safety for the ideas you choose to store in Notebook.ai, so you can rest easy they’ll always be there when you want them
  • Providing security for those same ideas, so only those you explicitly authorize have access to them

I want to share with you a few features that we just released specifically around keeping your data safe and always available to you, whenever you want it, from wherever you want it.

These updates also solve the page speed issues we’ve had with the site’s recent growth, which I believe falls into the realm of data safety: if the site is too slow for you to use, your data is too slow to get to. To make sure you can always access your data, we’ve made the site tons faster — and it’ll only get faster from here.

Database improvements

Upgrading the database is an important part of scaling up, but it requires downtime. Obviously, we want to stay ahead of this and handle it before it becomes an issue, so that’s what we’re doing now. We weren’t hitting any limits (yet) on the old database, but now we’re future-proofed for quite some time: the last time we upgraded the database was about a year ago, and I expect this one to last even longer than that.

With the upgrade, we also get a lot of great features to ensure your data in Notebook.ai is constantly kept safe and sound.

Concurrent connections increased 6x

Whenever you load a page on Notebook.ai, we fetch your information from a database to show to you. On the old database, we could only handle 20 concurrent connections — which means if 20 other people loaded a page at the same time as you, you’d have to wait for one of them to relinquish their connection to the database before you could use yours; in other words, a “wait your turn” mentality. This led to massive slowdown during peak times.
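To make the “wait your turn” behavior concrete, here is a minimal Python sketch of a fixed-size connection pool. It is only an illustration: the `ConnectionPool` class and its method names are hypothetical, and it is not the code Notebook.ai actually runs.

```python
import threading


class ConnectionPool:
    """Toy stand-in for a database connection pool with a fixed size."""

    def __init__(self, size: int) -> None:
        self.size = size
        self._slots = threading.BoundedSemaphore(size)

    def try_acquire(self) -> bool:
        """Take a connection if one is free; return False instead of waiting."""
        return self._slots.acquire(blocking=False)

    def release(self) -> None:
        """Hand a connection back so the next request in line can use it."""
        self._slots.release()


OLD_LIMIT = 20  # connection limit quoted for the old database

pool = ConnectionPool(OLD_LIMIT)
# 21 page loads arrive at once: 20 get a connection, the last one has to wait.
granted = [pool.try_acquire() for _ in range(OLD_LIMIT + 1)]
print(granted.count(True), "served,", granted.count(False), "waiting")

# As soon as one request finishes, the waiting request can take its turn.
pool.release()
retry_ok = pool.try_acquire()
print("waiting request served after a release:", retry_ok)
```

A real pool would block the extra request until a connection frees up rather than returning `False`; the non-blocking call is used here only so the example runs instantly.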

It turns out the 20 connections became constantly saturated once around 200 people were on the site at the same time, and the line kept growing: new visitors joined the queue while the people ahead of them were still waiting. I expect the new connection limit of 120 to handle over 1,000 concurrent users, which we’re still quite a long way from.
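The estimate above can be reproduced with some back-of-the-envelope arithmetic. This sketch assumes capacity scales linearly with the number of connections, which is a simplification (real traffic is burstier than that), and the variable names are purely illustrative.

```python
OLD_CONNECTIONS = 20         # old database connection limit
OLD_SATURATION_USERS = 200   # roughly where the old limit saturated
NEW_CONNECTIONS = 120        # new database connection limit

# Assumption: each connection can serve about the same number of users as before.
users_per_connection = OLD_SATURATION_USERS / OLD_CONNECTIONS
estimated_capacity = NEW_CONNECTIONS * users_per_connection

print(f"about {users_per_connection:.0f} users per connection")
print(f"estimated capacity: about {estimated_capacity:.0f} concurrent users")
```

Under that assumption the new limit works out to roughly 1,200 concurrent users, which lines up with the “over 1,000” figure.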

Increased region availability

There are now effectively two databases kept in sync that you will automatically connect to based on which is closer to you: one in the United States, and one in Europe. Connecting to a closer database will mean faster responses and faster page load times.
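As a rough illustration of “connect to whichever is closer,” the sketch below picks the region with the lowest measured latency. The region names, the latency numbers, and the `closest_region` helper are all made up for this example; the actual routing is handled automatically and may work quite differently.

```python
from typing import Mapping


def closest_region(latency_ms: Mapping[str, float]) -> str:
    """Return the region with the lowest round-trip latency."""
    if not latency_ms:
        raise ValueError("need at least one region to choose from")
    return min(latency_ms, key=lambda region: latency_ms[region])


# Hypothetical latency measurements (in milliseconds) for two visitors.
visitor_in_berlin = {"united-states": 110.0, "europe": 18.0}
visitor_in_chicago = {"united-states": 22.0, "europe": 105.0}

print(closest_region(visitor_in_berlin))
print(closest_region(visitor_in_chicago))
```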

Redundant backups

The database is now automatically synced to multiple hard drives. If the primary hard drive fails, the backup hard drive automatically replaces it. This keeps downtime to a minimum and means an up-to-date copy of the entire database is always available.

Database rollbacks

In the event of a malicious intrusion or exploit that is capable of destroying user data, we now have the capability to “roll back” the database to a previous point in time up to a week prior, recovering any lost data. Because this effectively deletes any data created after that point in time, this functionality will only be used in dire emergencies, but it ensures we’re protected against this kind of attack.

Additionally, we now keep 50 database snapshots for recovery purposes, up from 5 previously.
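Here is a small sketch of the two checks implied above: whether a target time still falls inside the one-week rollback window, and which stored snapshot is the most recent one taken before that time. The function names, the daily snapshot schedule, and the dates are all assumptions for illustration, not a description of the real tooling.

```python
from datetime import datetime, timedelta
from typing import Optional, Sequence

ROLLBACK_WINDOW = timedelta(days=7)  # "up to a week prior"


def within_rollback_window(target: datetime, now: datetime) -> bool:
    """True if `target` is in the past and no more than a week old."""
    return timedelta(0) <= now - target <= ROLLBACK_WINDOW


def latest_snapshot_before(
    snapshots: Sequence[datetime], target: datetime
) -> Optional[datetime]:
    """Most recent snapshot taken at or before `target`, if there is one."""
    earlier = [taken_at for taken_at in snapshots if taken_at <= target]
    return max(earlier) if earlier else None


# Illustrative values only: one snapshot per day at midnight (an assumption).
now = datetime(2018, 1, 18, 12, 0)
snapshots = [datetime(2018, 1, day) for day in range(10, 19)]
intrusion = datetime(2018, 1, 15, 9, 0)

print(within_rollback_window(intrusion, now))
print(latest_snapshot_before(snapshots, intrusion))
```

Anything written between the chosen restore point and the present is lost in a restore, which is why a rollback is reserved for emergencies.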

Better encryption

When not actively in use, the database is now encrypted with AES-256 block-level storage encryption, with keys managed by Amazon Web Services (AWS). All backup database snapshots are also stored in an encrypted S3 bucket.

Increased visibility

In addition to all of the above, the new database also gives better visibility into its health and any potential problems. It displays recent slow queries, which show which code most needs optimization, and it provides statistics on how well the cache is being utilized and where it could be improved.

Code improvements

We’ve also optimized much of the code around the site to perform quicker and more efficiently. In the absolute worst case, we were seeing pages taking over two minutes to load — those pages now load in around two seconds, and that’s worst-case.

On average, page load times have dropped from several seconds to around 300 milliseconds, with the fastest 50% of requests taking less than 150 milliseconds and the fastest 5% of page loads averaging just 50 milliseconds. This improvement may not seem like much, but it certainly adds up when you’re on a creative spree and want to jump from page to page as fast as you can before the inspiration is gone.
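For anyone curious how an average of 300 milliseconds can sit alongside a median under 150, here is a short sketch that computes the same three statistics from a list of load times. The sample values are invented to illustrate the shape of the distribution; they are not real measurements from the site.

```python
import statistics

# Made-up page load times in milliseconds (not real Notebook.ai data).
load_times_ms = [
    50, 90, 100, 110, 120, 120, 130, 130, 140, 140,
    150, 160, 200, 250, 300, 400, 500, 700, 1000, 1210,
]

ordered = sorted(load_times_ms)
fastest_count = max(1, len(ordered) // 20)  # the fastest 5% of samples
fastest_5_percent = ordered[:fastest_count]

mean_ms = statistics.mean(ordered)
median_ms = statistics.median(ordered)
fastest_mean_ms = statistics.mean(fastest_5_percent)

print("mean:", mean_ms)
print("median:", median_ms)
print("mean of fastest 5%:", fastest_mean_ms)
```

A handful of slow pages pulls the mean well above the median, so most page loads feel faster than the average suggests.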

Of course, this isn’t the be-all and end-all of improvements coming this way. As we continue to grow, we’ll need to keep optimizing and growing the site along with the worlds it contains. However, by staying ahead of the curve and optimizing before it’s absolutely necessary, I feel we’re in a very good spot to remain stable and maintain our lead as an amazing worldbuilding service for everyone around the world.

Happy worldbuilding!

Andrew

What is Indent Labs?

Indent Labs is, at its core, a collection of ambitious open-source natural language processing projects aimed squarely at technological breakthroughs and moonshots in the field of writing. Wouldn’t it be awesome if you could generate quality stories from outlines, or automatically outline a story? What about generating a story as you make decisions on behalf of a character? Or talking directly with your characters?

The first word processor showed up in the 1960s and revolutionized writing through technology. Isn’t it time for another shift forward?
