Dynamic batch-sizing for dataset with vastly diverse sequence lengths
A hacky way to keep the code structure consistent
Using serve from ray
With an aim to make my machine learning model usable for public I could successfully hosting with CPU support only using the the following code snippet.
from fastapi import FastAPI, File…