Doing more with less: Sync and Async support in one line of code.
How to support sync and async calling patterns in your Python methods with (almost) zero boilerplate.
More and more, we’re seeing Python libraries support both synchronous and asynchronous calling patterns to accommodate as many usage patterns as possible. Supporting both means your library is usable in simple client-side programs as well as in high-performance server environments where maximizing coroutine usage is critical.
The most salient example in the AI space is Langchain, where you find methods like:
invoke and ainvoke
generate and agenerate
batch and abatch
In Langchain specifically, the total code size for all the methods above amounts to ~3200 lines of code, split roughly equally between the synchronous and asynchronous variants. While this isn’t a lot in the context of the entire langchain repository (roughly 44k lines of code as of this post), there are certainly savings opportunities.
In working on Docprompt, I wanted to avoid duplicating sync and async call patterns in every interface that would use them, which would be most of the library’s abstractions. In fact, I wanted a way to add async support to any existing code in a backwards-compatible manner.
Introducing @flexible_methods
Here’s the usage pattern:
@flexible_methods(
    ("process_document_node", "aprocess_document_node"),
    ("_invoke", "_ainvoke"),
)
class AbstractTaskProvider(BaseModel, Generic[TTaskInput, TTaskConfig, TTaskResult]):
    ...
That’s it! Instances of AbstractTaskProvider support both process_document_node() and await aprocess_document_node(). In this case, we haven’t even implemented either method, and instead rely on the user to implement at least one in a downstream class.
A user can implement one or the other, depending on the nature of the task at hand. In scenarios where both must be implemented, the decorator is simply a no-op.
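To make the idea concrete, here is a minimal sketch of how a decorator of this kind could fill in the missing half of each pair. It is not the Docprompt implementation: the name simple_flexible_methods, the helper functions, and the example classes are all hypothetical, and the choice of asyncio.run and asyncio.to_thread as bridges is an assumption. The sketch also only handles the class it is applied to, so it does not cover the downstream-subclass case described above.

```python
import asyncio


def _is_real(func):
    """True for a hand-written method; False if missing or a generated wrapper."""
    return func is not None and not hasattr(func, "_original")


def _sync_from_async(async_func):
    def wrapper(self, *args, **kwargs):
        # Assumption: no event loop is running in this thread.
        # asyncio.run raises RuntimeError if one already is.
        return asyncio.run(async_func(self, *args, **kwargs))

    wrapper._original = async_func  # mark as generated, not user-written
    return wrapper


def _async_from_sync(sync_func):
    async def wrapper(self, *args, **kwargs):
        # Run the blocking call in a worker thread so the event loop stays free.
        return await asyncio.to_thread(sync_func, self, *args, **kwargs)

    wrapper._original = sync_func  # mark as generated, not user-written
    return wrapper


def simple_flexible_methods(*pairs):
    """Hypothetical, simplified stand-in for a sync/async pairing decorator."""

    def decorate(cls):
        for sync_name, async_name in pairs:
            sync_func = getattr(cls, sync_name, None)
            async_func = getattr(cls, async_name, None)
            if _is_real(sync_func) and not _is_real(async_func):
                setattr(cls, async_name, _async_from_sync(sync_func))
            elif _is_real(async_func) and not _is_real(sync_func):
                setattr(cls, sync_name, _sync_from_async(async_func))
            # If both are hand-written (or neither exists), nothing is changed.
        return cls

    return decorate


@simple_flexible_methods(("process", "aprocess"))
class Upper:
    def process(self, text):  # only the sync half is written by hand
        return text.upper()


@simple_flexible_methods(("process", "aprocess"))
class Lower:
    async def aprocess(self, text):  # only the async half is written by hand
        return text.lower()
```

With this in place, Upper gains an awaitable aprocess and Lower gains a blocking process. The main trade-off in the sketch is the sync bridge: asyncio.run cannot be called from code that is already running inside an event loop, so a production version would need a different strategy there.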
What happens in the case of inheritance? We simply traverse up the MRO chain via cls.__mro__ and find the “closest” implementation in either case.
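from typing import Callable, Optional, Tuple, Type


def get_closest_attr(cls: Type, attr_name: str) -> Tuple[Type, Optional[Callable], int]:
    closest_cls = cls
    # Look only at attributes defined directly on cls, not inherited ones.
    # cls.__dict__ is a mapping, so use .get() rather than getattr().
    attr = cls.__dict__.get(attr_name)
    depth = 0

    if attr and hasattr(attr, "_original"):
        # Attributes tagged with _original are skipped (presumably generated
        # wrappers rather than hand-written implementations).
        attr = None
    elif attr:
        return (cls, attr, 0)

    # cls.__mro__[0] is cls itself, so start at the first base (depth 1).
    for idx, base in enumerate(cls.__mro__[1:], start=1):
        if not attr and attr_name in base.__dict__:
            if not hasattr(base.__dict__[attr_name], "_original"):
                closest_cls = base
                attr = base.__dict__[attr_name]
                depth = idx

        if attr:
            break

    return (closest_cls, attr, depth)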
This implementation is definitely opinionated. For example, what happens if we have three classes: a base class, Base1, and Base2? Base1 implements an async method, but Base2 implements a “closer” sync method. Which one do we use? By default, Base2, being closer, takes precedence, and its implementation is used for both sync and async calls.
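The three-class scenario can be traced with a small sketch. The helper depth_of and the method names run and arun are hypothetical and exist only for illustration. Resolving ties in favor of the sync method is also an assumption, since this scenario has no tie to settle.

```python
from typing import Optional


def depth_of(cls: type, name: str) -> Optional[int]:
    """Return how many MRO steps from cls the nearest definition of name is."""
    for depth, klass in enumerate(cls.__mro__):
        if name in klass.__dict__:
            return depth
    return None


class Base:
    pass


class Base1(Base):
    async def arun(self):
        return "async from Base1"


class Base2(Base1):
    def run(self):
        return "sync from Base2"


sync_depth = depth_of(Base2, "run")    # defined on Base2 itself
async_depth = depth_of(Base2, "arun")  # inherited from Base1, one step up

# The smaller depth wins; a tie going to sync is an assumption of this sketch.
source = "sync" if sync_depth <= async_depth else "async"
```

Here the sync method sits at depth 0 and the async method at depth 1, so the sync implementation on Base2 would back both call styles.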
The actual implementation of this method is pretty involved. Feel free to take a look here! I plan to use this decorator in this library, as well as in future ones where “picking a side” (sync vs. async) is not feasible for the best user experience.
I’d love to hear your thoughts on this pattern, and how it could be extended to pure functions (if it even makes sense!), and other possible use cases.
Cheers!