OneFlow v0.9.0 Came Out!

Published in CodeX, Feb 3, 2023

We are thrilled to announce the release of OneFlow v0.9.0. This update contains 640 commits. For the full changelog, please check out: https://github.com/Oneflow-Inc/oneflow/releases/tag/v0.9.0. Come install OneFlow v0.9.0 for a new user experience. Your feedback will be much appreciated!

Highlights and optimizations in this release:

1. PyTorch API compatibility

With 86 new APIs and operators aligned with PyTorch and 104 operator-compatibility bugs fixed, OneFlow v0.9.0 offers better PyTorch API and model compatibility. Users can now migrate more PyTorch models to OneFlow with one click and get faster performance.

One-click migration of Stable Diffusion, GLM, YOLOv5, and other models to OneFlow.
More convenient model migration: oneflow.load can directly load models saved with torch.save.
With the newly added oneflow.mock_torch module and mock method (https://docs.oneflow.org/master/cookies/oneflow_torch.html), OneFlow can migrate complex PyTorch models spanning multiple scripts with one click, without changing the original PyTorch scripts.
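The idea of running an unchanged script against a different backend can be illustrated in plain Python. This is only a conceptual sketch and not how oneflow.mock_torch is implemented: the names `standin_backend` and `unmodified_script` are hypothetical, and the real module's usage is described in the documentation linked above.

```python
import sys
import types

# Hypothetical stand-in backend; in a real migration this role is played by OneFlow.
standin = types.ModuleType("standin_backend")
standin.zeros = lambda rows, cols: [[0.0] * cols for _ in range(rows)]


def unmodified_script():
    """Plays the role of an existing PyTorch script: it only says `import torch`."""
    import torch
    return torch.zeros(2, 3)


# Register the stand-in under the name "torch"; `import` consults sys.modules first.
previous = sys.modules.get("torch")
sys.modules["torch"] = standin
try:
    result = unmodified_script()
finally:
    # Undo the redirection so later imports see the real package (if any).
    if previous is None:
        del sys.modules["torch"]
    else:
        sys.modules["torch"] = previous

print(result)
```

The script body never changes; only the meaning of the name `torch` does, which is why multi-file projects can be redirected in one step.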

2. Improving the usability of distributed programming

Global Tensor adds a series of interfaces and methods that make distributed programming more convenient, and related bugs have been fixed.
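For readers new to Global Tensor, the basic workflow looks roughly like the sketch below. It assumes OneFlow is installed and uses a single-process CPU placement so it can run without a distributed launch; the helper name `global_tensor_demo` is hypothetical, and the import guard only exists so the snippet does nothing where OneFlow is absent.

```python
import importlib.util

# Guard so the snippet is a no-op where OneFlow is not installed.
HAVE_ONEFLOW = importlib.util.find_spec("oneflow") is not None


def global_tensor_demo():
    import oneflow as flow

    # Single-process placement on CPU; a real job would list several ranks.
    placement = flow.placement("cpu", ranks=[0])

    local = flow.randn(4, 8)
    # Turn the local tensor into a Global Tensor split along dim 0.
    split = local.to_global(placement=placement, sbp=flow.sbp.split(0))
    # Redistribute the same data as a broadcast (replicated) tensor.
    replicated = split.to_global(placement=placement, sbp=flow.sbp.broadcast)

    return split.is_global, tuple(replicated.shape), tuple(replicated.to_local().shape)


info = global_tensor_demo() if HAVE_ONEFLOW else None
print(info)
```

With more than one rank in the placement, the same code would describe how the data is laid out across devices, while the logical shape stays (4, 8).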

3. Supporting automatic parallelism

Graph introduces automatic parallelism (version 1), which automatically searches for the fastest SBP under a specified Placement. When writing distributed models with Global Tensor, users no longer need to work out the parallelism scheme themselves.

For more information, please check out: https://oneflow.readthedocs.io/en/master/auto_parallel.html
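As a rough sketch of how the feature is switched on, the snippet below follows the `self.config.enable_auto_parallel(True)` pattern from the documentation linked above. It assumes OneFlow is installed; the class name `AutoParallelGraph`, the factory `make_auto_parallel_graph`, and the tiny Linear model are hypothetical, and the graph is only constructed, not executed.

```python
import importlib.util

# Guard so the snippet is a no-op where OneFlow is not installed.
HAVE_ONEFLOW = importlib.util.find_spec("oneflow") is not None


def make_auto_parallel_graph():
    import oneflow as flow

    class AutoParallelGraph(flow.nn.Graph):  # hypothetical name
        def __init__(self, model):
            super().__init__()  # must be called before touching self.config
            self.model = model
            # Ask Graph to search for SBP signatures instead of relying on hand-written ones.
            self.config.enable_auto_parallel(True)

        def build(self, x):
            return self.model(x)

    return AutoParallelGraph(flow.nn.Linear(8, 4))


graph = make_auto_parallel_graph() if HAVE_ONEFLOW else None
print(type(graph).__name__)
```

Calling such a graph on Global Tensors whose Placement spans several devices is what would trigger compilation and the SBP search; that step is omitted here because it needs a multi-device launch.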

4. Better performance

Graph delivers better performance and lower memory overhead through a series of optimizations to memory usage, execution speed, pipeline masking, and compilation speed.

Operator and system optimizations have also been added, including Eager instruction scheduling, high-performance CUDA kernels, and the opening up of multiple memory pools.

After simple tuning, the GLM-Large (335M) pre-trained model on OneFlow v0.9.0 outperforms the original GLM implementation based on PyTorch, DeepSpeed, and Apex, with up to 3x the performance and 1/3 of the memory overhead saved.

On A100 GPUs (SXM 80GB / PCIe 40GB), OneFlow Stable Diffusion inference is faster than that of other deep learning frameworks and compilers.

5. Debugging

Graph provides a series of debugging aids, including memory log analysis and display of compilation progress and of the computation graph.
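A minimal way to try the Graph debugging output might look like the following. It assumes OneFlow is installed and relies on `nn.Graph.debug` to raise the verbosity level; the names `LinearGraph` and `debug_demo` are hypothetical, and what exactly gets printed depends on the installed version.

```python
import importlib.util

# Guard so the snippet is a no-op where OneFlow is not installed.
HAVE_ONEFLOW = importlib.util.find_spec("oneflow") is not None


def debug_demo():
    import oneflow as flow

    class LinearGraph(flow.nn.Graph):  # hypothetical name
        def __init__(self):
            super().__init__()
            self.linear = flow.nn.Linear(3, 2)

        def build(self, x):
            return self.linear(x)

    graph = LinearGraph()
    graph.debug(1)                  # request more verbose compile-time output
    y = graph(flow.randn(4, 3))     # the first call compiles, then runs the graph
    print(graph)                    # prints a textual view of the graph
    return tuple(y.shape)


shape = debug_demo() if HAVE_ONEFLOW else None
print(shape)
```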

6. IR

OneFlow IR supports additional compilation optimization functions such as JIT compilation of LR code, distributed description of SBP signature, and the new OKL Dialect.

7. OneFlow-ONNX

The newly released OneFlow-ONNX v0.6.0 makes the conversion interface easier to use with multiple new features. It also adds support for another 6 models and over 20 Ops, and fixes 6 bugs in the conversion process. You can install it with a single command: pip install oneflow-onnx==0.6.0

Repository URL: https://github.com/Oneflow-Inc/oneflow_convert

8. Better error prompt

OneFlow's error prompts are now more user-friendly: they highlight the error content and trim unnecessary internal details, so you can see the location and type of an error at a glance.

Check out the link below for the full version of OneFlow v0.9.0 updates: https://github.com/Oneflow-Inc/oneflow/releases/tag/v0.9.0

Many thanks to the following contributors:

liujuncheng, BBuf, wyg1997, jackalcooper, Flowingsun007, clackhan, daquexian, marigoold, lixinqi, guo-ran, hjchen2, strint, ouyangyu, MARD1NO, small1945, reygu, Ldpe2G, leaves-zwx, Yipeng1994, zhongshsh, lixiang007666, mosout, chengtbf, hhhfccz, doombeaker, howin98, xiacijie, farmerzhang1, shangguanshiyuan, JasonChen9, liufengwei0103, youxiudeshouyeren, laoliu97, EsdeathYZH, rejoicesyc, AsakusaRinne, LijunZhang01, Chenqll, xiezipeng-ML, simonJJJ, ShawnXuan