Database access plays an integral part in product and software development. It is one of the most critical parts of an application, as it needs to be well-designed to stay robust and pluggable.
This is where object-relational mapping (ORM) comes into play. It is the common ground for most developers when it comes to data access. An ORM library helps simplify and solidify the implementation, and at the same time it speeds up development by handling the complex plumbing for us. Of course, we should also know the "pros" and "cons" of introducing an ORM into our application.
Nowadays, in the C# world (or .NET as a whole), many ORM libraries are available, both open-source and commercial. The most popular one is Entity Framework (a data-modelling framework from Microsoft), and others like NHibernate, Dapper (a micro-ORM from Stack Overflow) and LLBLGen Pro are also available for use.
With these, we as developers are responsible for choosing the best fit for our needs (development time, simplicity, efficiency, performance, etc).
Why use RepoDb?
It is a dynamic, lightweight, efficient and high-performance hybrid-ORM library for .NET.
- The term "high-performance" refers to how fast this ORM converts raw data into class objects, and pushes the class objects back into the database as actual data.
- The term "efficient" refers to how well this ORM manages the computer memory when manipulating the objects throughout the whole cycle of the process.
It is considered a "hybrid-ORM" as it provides the features of both a micro-ORM and a full-ORM. It gives developers the flexibility to easily switch between the "micro" and "full" operations during development.
This library supports the full feature set of a micro-ORM, as it can do the CRUD operations via raw SQL. It also supports the specialized features of a full-ORM, allowing the developer to manipulate the data via method-based operations.
Below are sample codes for raw SQL:
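For instance, a sketch of querying and updating a Customer record via raw SQL (the connection string, and a Customer model with Id and Name properties, are assumptions; the RepoDb and System.Data.SqlClient namespaces are imported):

```csharp
using (var connection = new SqlConnection(connectionString))
{
    // query via raw SQL
    var customer = connection.ExecuteQuery<Customer>(
        "SELECT * FROM [dbo].[Customer] WHERE Id = @Id;",
        new { Id = 10045 }).FirstOrDefault();

    // update via raw SQL
    var affectedRows = connection.ExecuteNonQuery(
        "UPDATE [dbo].[Customer] SET Name = @Name WHERE Id = @Id;",
        new { Id = 10045, Name = "John Doe" });
}
```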
And below are the equivalent method-based calls:
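A sketch of the same query and update done through the method-based operations (same assumed Customer model):

```csharp
using (var connection = new SqlConnection(connectionString))
{
    // query via the fluent Query operation (by primary key)
    var customer = connection.Query<Customer>(10045).FirstOrDefault();

    // update via the fluent Update operation
    customer.Name = "John Doe";
    var affectedRows = connection.Update(customer);
}
```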
The other features of full-ORMs are also available in this library through method-based calls (i.e. BatchQuery, BulkInsert, Delete, DeleteAll, Insert, InsertAll, Update, UpdateAll, Merge, MergeAll, etc).
The batch operations in this library are all packed-statements; a single DbCommand object is used to execute multiple SQL statements in one go. The process is ACID: an implicit transaction is used internally if the caller does not provide one. By default, 10 packed-statements are used in each batch. See sample code below.
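A sketch of an InsertAll call, assuming an Order model whose property names (CustomerId, ProductId, Quantity, CreatedDateUtc) match the table columns:

```csharp
var orders = new List<Order>
{
    new Order { CustomerId = 10045, ProductId = 12, Quantity = 2, CreatedDateUtc = DateTime.UtcNow },
    new Order { CustomerId = 10045, ProductId = 15, Quantity = 1, CreatedDateUtc = DateTime.UtcNow }
};

using (var connection = new SqlConnection(connectionString))
{
    // both rows are sent as one packed-statement in a single DbCommand
    var insertedRows = connection.InsertAll(orders);
}
```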
The statement above creates 2 Order records for Customer 10045 via an InsertAll packed-statement. If any statement in the batch fails, then all changes of the operation are rolled back.
These functionalities are also available to both MergeAll and UpdateAll operations.
On the other hand, the bulk operations use the bulk functionality provided by ADO.NET. By default, the library uses the SqlBulkCopy class to do the "bulk-insert" operations for SQL Server databases.
Although the bulk operations are much faster than the batch operations, they have their own drawbacks. The developers must know when and where to properly use them within the application.
Replicating Big Data via BulkInsert
One common scenario of using bulk-operation is to maximize the performance when replicating data from one database into another database. Let us say from Oracle database into SqlServer database, or vice versa. See sample below.
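A sketch under stated assumptions (a Person model shared by both databases, and the Oracle managed data provider); the exact BatchQuery argument order may vary by library version:

```csharp
using (var sourceConnection = new OracleConnection(oracleConnectionString))
{
    // read the first page of 1,000,000 rows from the Oracle table
    var people = sourceConnection.BatchQuery<Person>(
        where: p => p.Id > 10000,
        page: 0,
        rowsPerBatch: 1000000,
        orderBy: OrderField.Parse(new { Id = Order.Ascending }));

    using (var destinationConnection = new SqlConnection(sqlConnectionString))
    {
        // push the rows into SQL Server (SqlBulkCopy underneath)
        var insertedRows = destinationConnection.BulkInsert(people);
    }
}
```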
The script above retrieves 1 million rows (a single page) from the Oracle database table named Person where the Id is greater than 10000, and inserts the data into the Person table of the SQL Server database.
Say a developer wishes to replicate via classes; assume a class named Customer has an equivalent table in both the Oracle and SQL Server databases. This can be done by the scripts below.
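A sketch, assuming a Customer model that maps to equivalent tables in both databases:

```csharp
using (var sourceConnection = new OracleConnection(oracleConnectionString))
{
    // materialize the Oracle rows as Customer objects
    var customers = sourceConnection.ExecuteQuery<Customer>("SELECT * FROM Customer;");

    using (var destinationConnection = new SqlConnection(sqlConnectionString))
    {
        var insertedRows = destinationConnection.BulkInsert(customers);
    }
}
```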
Please note, the scripts above were not tested.
Executing a Stored Procedure
To execute a stored procedure, the extended Execute methods of the IDbConnection object must be called, and the command type must be set to CommandType.StoredProcedure during the call.
List of Execute methods:
- ExecuteQuery — executes a SQL statement and converts the result into a list of .NET CLR Types.
- ExecuteNonQuery — executes a SQL statement and returns the number of rows affected.
- ExecuteReader — executes a SQL statement and returns an instance of DbDataReader.
- ExecuteScalar — executes a SQL statement and returns the first-column of the first-row.
- ExecuteQueryMultiple — executes multiple packed SQL statements and allows the developers to control the extraction of the results via the QueryMultipleExtractor class.
Let us say a stored procedure named sp_get_customer_orders is present in the database, accepting a customer id and returning that customer's orders.
Then, it can be called via the code below.
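A sketch of the call, assuming the procedure accepts a @CustomerId parameter and returns rows that map to an Order model (System.Data is imported for CommandType):

```csharp
using (var connection = new SqlConnection(connectionString))
{
    var orders = connection.ExecuteQuery<Order>(
        "[dbo].[sp_get_customer_orders]",
        new { CustomerId = 10045 },
        commandType: CommandType.StoredProcedure);
}
```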
Multiple Resultsets via QueryMultiple
This feature has been introduced to help the developers minimize the round trips between the client application and the database. It allows the developers to execute multiple SQL statements in a single call.
There are 2 ways of doing this: via the native ExecuteQueryMultiple method, or via the method-based QueryMultiple<T1, T2>.
The ExecuteQueryMultiple method requires the caller to issue multiple SQL statements. See sample code below.
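A sketch, assuming Customer and Order models:

```csharp
using (var connection = new SqlConnection(connectionString))
{
    using (var result = connection.ExecuteQueryMultiple(
        "SELECT * FROM [dbo].[Customer] WHERE Id = @CustomerId; " +
        "SELECT * FROM [dbo].[Order] WHERE CustomerId = @CustomerId;",
        new { CustomerId = 10045 }))
    {
        // extract the result sets in the order the statements were issued
        var customer = result.Extract<Customer>().FirstOrDefault();
        var orders = result.Extract<Order>();
    }
}
```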
The calls can also be combined with the stored procedures.
The QueryMultiple<T1, T2> method works similarly to the ExecuteQueryMultiple method, except that the developers are required to pass the generic .NET CLR types during the call.
The developers are also required to pass a Linq-based expression for each query.
This method returns a System.Tuple object. See sample code below.
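A sketch, with the same assumed Customer and Order models:

```csharp
using (var connection = new SqlConnection(connectionString))
{
    var result = connection.QueryMultiple<Customer, Order>(
        customer => customer.Id == 10045,
        order => order.CustomerId == 10045);

    // the result is a System.Tuple; Item1/Item2 follow the generic type order
    var customer = result.Item1.FirstOrDefault();
    var orders = result.Item2;
}
```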
Both the ExecuteQueryMultiple and QueryMultiple methods support up to 7 result sets (tuples).
Support for dynamic objects is very strong in this library. Most operations accept dynamic objects and System.Dynamic.ExpandoObject instances as arguments.
The sample code below uses a dynamic object as an expression.
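A sketch, assuming the table-based overload of Query (which returns ExpandoObject items) is used:

```csharp
using (var connection = new SqlConnection(connectionString))
{
    // the anonymous object acts as the query expression (Country AND LastName)
    var customers = connection.Query("[dbo].[Customer]",
        new { Country = "Spain", LastName = "Santiago" });
}
```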
It returns all the customers from Spain whose LastName is Santiago. The type of the return value is IEnumerable<ExpandoObject>.
The developers can also utilize this feature if the intention is not to create model classes when manipulating the data from the database. See sample code below.
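A sketch of inserting a new Customer record with no model class involved (the column names are assumptions):

```csharp
using (var connection = new SqlConnection(connectionString))
{
    var id = connection.Insert("[dbo].[Customer]", new
    {
        Name = "John Doe",
        LastName = "Santiago",
        Country = "Spain"
    });
}
```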
Or update an existing Customer record.
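A sketch; the exact shape of the table-based Update overload (fields to update plus a where expression) is an assumption:

```csharp
using (var connection = new SqlConnection(connectionString))
{
    var affectedRows = connection.Update("[dbo].[Customer]",
        new { Name = "James Doe" },      // fields to update
        new { Id = 10045 });             // where expression
}
```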
The same goes for other operations like BulkInsert, Delete, Merge, DeleteAll, MergeAll, UpdateAll, etc.
Note: As of version 1.9.7, the library does not support ExpandoObject in BulkInsert, Insert, InsertAll, Merge, MergeAll, Update and UpdateAll operations.
The code below will work because the type of the entity variable is dynamic.
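A sketch of the working case; the entity is an anonymous object, not an ExpandoObject:

```csharp
using (var connection = new SqlConnection(connectionString))
{
    // an anonymous object is accepted by the table-based Update
    var entity = new
    {
        Id = 10045,
        Name = "James Doe"
    };
    var affectedRows = connection.Update("[dbo].[Customer]", entity);
}
```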
The code below will not work for Update operation because the return type of the Query operation is ExpandoObject. An exception will be thrown at the Update operation.
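A sketch of the failing case; the table-based Query returns ExpandoObject items:

```csharp
using (var connection = new SqlConnection(connectionString))
{
    // the item is an ExpandoObject underneath, even though it is typed dynamic
    var entity = connection.Query("[dbo].[Customer]", 10045).FirstOrDefault();
    entity.Name = "James Doe";

    // throws: ExpandoObject is not supported by Update as of version 1.9.7
    connection.Update("[dbo].[Customer]", entity);
}
```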
But the code below will work for Update operation.
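A sketch of the working case, using the assumed Customer model instead:

```csharp
using (var connection = new SqlConnection(connectionString))
{
    var entity = connection.Query<Customer>(10045).FirstOrDefault();
    entity.Name = "James Doe";

    // works: the entity is a Customer model, not an ExpandoObject
    connection.Update(entity);
}
```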
The library has 2 built-in repositories, named DbRepository and BaseRepository, for data-layering purposes. These repositories can be abstracted or instantiated right away by the developers.
The DbRepository is used to create a higher-level repository that is shared across all the models of the database.
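A sketch of such a shared repository; the NorthwindRepository name and the Customer/Order models are assumptions:

```csharp
public class NorthwindRepository : DbRepository<SqlConnection>
{
    public NorthwindRepository(string connectionString)
        : base(connectionString)
    { }

    public Customer GetCustomer(int id)
    {
        return Query<Customer>(id).FirstOrDefault();
    }

    public IEnumerable<Order> GetCustomerOrders(int customerId)
    {
        return Query<Order>(order => order.CustomerId == customerId);
    }
}
```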
The repository above uses both the Customer and Order models of the database and can be shared among all other models.
To use it, use the code below.
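A sketch, assuming a NorthwindRepository class derived from DbRepository<SqlConnection> that exposes GetCustomer and GetCustomerOrders methods:

```csharp
using (var repository = new NorthwindRepository(connectionString))
{
    var customer = repository.GetCustomer(10045);
    var orders = repository.GetCustomerOrders(10045);
}
```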
The developers can also simplify the call by instantiating the DbRepository class directly.
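A sketch of the direct instantiation, with the same assumed models:

```csharp
using (var repository = new DbRepository<SqlConnection>(connectionString))
{
    var customer = repository.Query<Customer>(10045).FirstOrDefault();
    var orders = repository.Query<Order>(order => order.CustomerId == 10045);
}
```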
The BaseRepository is used to create a model-specific repository. It is implemented as abstract and cannot be instantiated right away.
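A sketch of a model-specific repository for the assumed Customer model, and its usage:

```csharp
public class CustomerRepository : BaseRepository<Customer, SqlConnection>
{
    public CustomerRepository(string connectionString)
        : base(connectionString)
    { }
}

// usage: the operations are already scoped to the Customer model
using (var repository = new CustomerRepository(connectionString))
{
    var customer = repository.Query(10045).FirstOrDefault();
}
```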
Performance and Efficiency
This feature is the "heart" of the library, and the main reason why it exists in the first place. Currently, the library has topped the recent "official run" of Frans Bouma's RawDataAccessBencher in both the "performance" and the "memory-efficiency" tests.
The screenshot below shows the result of the "performance" test.
And the screenshot below shows the result of the "memory-efficiency" test.
The official results can be found here.
The library's packed-statement is not part of the official RDAB run. It would be an apples-and-oranges comparison, as per Frans Bouma, the author of RDAB: the ORMs included in that test all do change/state tracking, whereas this library does not.
However, in my development environment, this library has also topped the "packed-insert" executions. The modified version of RDAB can be found on my GitHub account; anyone can clone the repository and execute the bencher to see the results locally.
The caching feature allows the developers to cache the result of the Query and QueryAll operations. By default, the internal MemoryCache class is used by the library, and the cache-item expiration is 180 minutes.
To cache the result, simply pass a value to the cacheKey argument of the Query or QueryAll operations. See sample code below.
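A sketch, assuming a Product model; the cache key name is arbitrary:

```csharp
using (var connection = new SqlConnection(connectionString))
{
    // the first call hits the database; subsequent calls with the same
    // key are served from the memory cache for 180 minutes
    var products = connection.QueryAll<Product>(cacheKey: "AllProducts");
}
```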
In the code above, the returned list of Product objects is cached in the computer memory. The next time the code is executed, the engine picks the result from the cache, not from the database, for the next 3 hours.
This is very useful for developers who wish to cache "non-changing" or "look-up" data.
The developers can also create a customized version of cache-class by simply implementing the interface ICache. For further information about this, please visit the official documentation here.
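A skeleton of a custom cache is sketched below; the exact ICache member list may differ between library versions, so treat this as an outline rather than a drop-in implementation:

```csharp
public class JsonFileCache : ICache
{
    public void Add(string key, object value)
    {
        // serialize the value and persist it (e.g. to a JSON file)
    }

    public bool Contains(string key)
    {
        // return whether a non-expired entry exists for the key
        return false;
    }

    public CacheItem Get(string key)
    {
        // read and deserialize the persisted entry
        return null;
    }

    public void Remove(string key)
    {
        // delete the persisted entry
    }

    public void Clear()
    {
        // delete all persisted entries
    }
}
```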
The library fully supports SQL Server. The other database providers are also supported, but only through the raw-SQL operations: ExecuteQuery, ExecuteNonQuery, ExecuteReader, ExecuteScalar and ExecuteQueryMultiple.
In order to extend the "fluent" support of the method-based calls to other data providers, the developers need to implement the necessary classes via the required interfaces.
Supporting the other Data Providers
To fully support the Oracle data provider, the developers need to implement the following.
- OracleStatementBuilder that implements the IStatementBuilder interface.
- OracleDbOperationProvider that implements the IDbOperationProvider interface.
- OracleDbHelper that implements the IDbHelper interface.
Once implemented, the developer can then map the objects by calling the following mappings.
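The mapping calls are sketched below; the exact mapper class names and signatures are assumptions based on the library's mapping pattern, so verify them against the documentation for your version:

```csharp
// map the custom classes to the OracleConnection type
StatementBuilderMapper.Map(typeof(OracleConnection), new OracleStatementBuilder());
DbOperationProviderMapper.Map(typeof(OracleConnection), new OracleDbOperationProvider());
DbHelperMapper.Map(typeof(OracleConnection), new OracleDbHelper());
```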
To fully support the other data providers, the same process must be done.
The approach to the actual development is TDD, complemented at some points by code-guards via post-code unit tests. The library has 2300+ integration tests and 1200+ unit tests as of this writing.
Each test covers a different scenario and is written to test both the sync and async operations.
Some test scenarios may co-exist in IDbConnection, DbRepository and BaseRepository, but they target different code bases.
Below are the links to be visited for this library.
- Project: https://github.com/mikependon/RepoDb
- Package: https://www.nuget.org/packages/RepoDb
- Documentation: https://repodb.readthedocs.io/en/latest/
This library is very simple and lightweight. It is still considered a micro-ORM, as it does not offer the advanced features of the big ORMs, like change tracking, state tracking, etc.
This library is meant for developers who are comfortable using the micro-ORM features and are willing to use operations that tend to be present in the full-ORMs.
This library is built with performance, efficiency, extensibility and quality in mind at all costs.
As the author of this library, I would be grateful to receive any feedback, recommendations and even bug reports through this link. I am looking forward to collaborating more with the .NET community.
Thank you so much for reading this article. /Michael