The Unreasonable Effectiveness of Stored Procedures

Konstantin Surkov
The Startup
Published in
5 min readMay 3, 2020

--

From time to time I see database design that does not use stored procedures. That always makes me sad, and that is the main reason for this article.

Of course, “does not use” is an extreme condition, more common case is “using stored procedures in a manner that does not bring all benefits they can provide”. Sad nevertheless, but can be helped.

This article is going to be very practical. It is not about “academic purity” or some such. For every provided reason why you should use stored procedures, I will also show how business will benefit from it.

Every time I say “you” in this article I mean “you, software developer” and “you, database architect”.

Time-to-market and development costs

These are two different metrics, but highly correlated. They both also correlate with the quality of your design. From the design point of view, stored procedures play several important roles.

Separation of concerns

Systems that use databases can be thought of as having two different domains. One domain is your application code (Python, Go, what have you). Another domain is database code (procedures, functions, triggers, tables, foreign keys, etc.)

In heterogeneous systems (in this case, systems that use more than one implementation language), you don’t mix together different languages. Here’s your Python code, and here’s your Go code, residing in separate projects and, most likely, separate repositories. This is good industry practice and just common sense. For some reason, when it comes to SQL, some people think it is suddenly okay to mix-in SQL statements right into your application code. But it’s not. It’s just as weird as mixing together Go and Python.

Application code intertwined with SQL statements

Putting all SQL code in separate project provides high cohesion inside that project and low coupling with projects from other domains. Both are good things.

Clear interface

Stored procedures provide clear interface, or set of interfaces, between different domains. They take your design from the state “one system can reach into the guts of the other system and do whatever it wants” to the state “systems interact only through well-defined interfaces”. Again, it is well-established practice and even common place in general system design, but mysteriously often forgotten when it comes to SQL.

Well-defined interface

Code organization and reuse

Stored procedure, after all, is not much more than a named batch of SQL code. One or more SQL statements bundled together and given a name. You give your methods names, don’t you? That helps you to organize your code better, and it also allows you to reuse them without code duplication. Exactly the same reasoning works for stored procedures.

Maintenance and development cost

While previous chapter was about benefits you would get while developing brand new system (version 1.0) or developing new set of features, unrelated or almost unrelated to existing codebase, this chapter deals with fixing bugs and refactoring. As many of you know, refactoring is something you always have to do, at least to some degree, when changing implementation of existing features or adding new features that are tightly coupled with existing ones.

During both refactoring and fixing bugs it is often necessary to find all places in code that deal with given database object (for example, data table). In other words, find all functional dependencies.

If all access to your data goes through stored procedures, it is as simple as running

select … from sys.dm_sql_referencing_entities(your_object)

If your code is a set of SQL statements interspersed with application code, good luck finding all code that touches this particular table, one way or another. If, god forbid, you use dynamic SQL generation, even luck is not likely to help you. Operational word here: discoverability.

Security

If all data access goes through stored procedures, it is easy to define database user that will not have any direct access to tables, only to stored procedures, all or some well-defined subset of them. Application that is logged under that user will not be able to do anything else but to call stored procedures it is permitted to call. How does it benefits business? It adds another layer of protection from unauthorized access, one that is easy to understand and manage. Loss of business reputation because of unauthorized data access is one too common these days, and directly impacts bottom line when customer decides to switch to competitor because of that.

Data consistency

Often data operations need to be atomic. Atomic in this case means that operation should be either fully completed or completely rolled back without leaving any trace. There shouldn’t be any “partially completed” operations.

Atomic operation on atoll Bikini, 1954

If all your SQL code is in stored procedures anyway, it is as easy as using this template:

begin try — begin tran — <many lines of your atomic code> — commit — end try — begin catch — rollback — end catch

If you are trying to manage transactions from your application code, you will sooner or later have the situation in production when application died after making data change inside transaction but before it had a chance to commit or roll back. Or maybe it did not die, but just lost database connectivity for a few minutes. Depending on what you have changed and what locking policy you have, you may have ended up with only a few rows exclusively locked for those few minutes, blocking reads that tried to read those rows, or with the whole table exclusively locked, blocking any reads from that table.

In other words, using explicit transactions only in-process, e.g. within stored procedures, will save you from some operational pain and customer from the situation when half of your system suddenly freezes for a few minutes.

Performance

Stored procedures offer slightly better performance than SQL batch due to the fact that stored procedures are executed via RPC and therefore parameters are passed on the wire in native/binary format. Also, obviously, when you call stored procedure, you only need to pass parameters, not the body of the procedure itself.

SQL batch, on the other hand, passes all parameters as strings, which is not as efficient. Also, obviously, SQL batch need to pass whole SQL body every time, parse it, and find the compiled version of every statement in procedure cache (earlier versions of Sql Server would go even as far as to only compile stored procedures, not batches, hence the name “procedure cache”).

For stored procedures that are fast, but executed often, the overhead of sending and parsing of SQL batch body and it’s parameters can reach tens or even hundreds percents, comparing to direct stored procedure execution.

In conclusion, given all the advantages stored procedures provide, only simplest POC-level implementations should be allowed to do without using them. The bigger your system grows, the more it will benefit from stored procedures, and the earlier you start using stored procedures in your system’s life cycle, the more benefits described above you will enjoy.

--

--