Day 9: Tips and Tricks for Ray APIs

Jules S. Damji
3 min read · Sep 1, 2022


source: https://pixabay.com/photos/tips-tricks-tips-and-tricks-4905013/

Well, I’m back after a short respite, as we marched toward Ray 2.0. Day 9 of #30DaysOfRay explores some tips, tricks, and Ray gotchas that trip up beginners, explained with code examples. The Ray docs for Ray 2.0 have undergone a herculean effort to include code snippets, examples, user guides, concepts, and a consistent table of contents. The tips, tricks, gotchas, and design patterns and anti-patterns are illustrations of these efforts. Below is a summary of these tips and tricks, with links to code samples in the documentation.

At Ray Summit last week, I covered bits of this as part of the training class “Introduction to Ray for distributed applications.” The notebook link is included below, as are links to the documentation for each of the tips and to days 1–8 of #30DaysOfRay.

Tip 1: Delay ray.get() until you need the results

Tip 2: Avoid tiny Ray remote tasks

Tip 3: Avoid passing same object repeatedly to remote Ray tasks

Tip 4: Avoid waiting for all Ray tasks to finish before processing their results

Tip 5: Gotchas to be aware of: environment variables, filenames, and placement groups

💁‍♀️Tip 1: Delay ray.get() as much as possible. Use it only when needed, and avoid calling it in a loop. Always keep in mind that ray.get() is a blocking operation; if called eagerly, it can hurt parallelism. Instead, try to write your program so that ray.get() is called as late as possible.

💁‍♀️Tip 2: Small tasks add overhead from scheduling, interprocess communication, and updating system metadata. As a result, a plain Python equivalent will often be faster than Ray’s distributed version. It is better to make Ray tasks larger and more compute intensive, which amortizes the overhead across the parallel work. Here is the link to a code example that demonstrates making tiny tasks larger to amortize the merits of parallelism.

💁‍♀️ Tip 3: When you pass a large object as an argument to a remote function, Ray calls ray.put() under the hood to store that object in the local object store. Doing this once for the same object, say outside a loop, can significantly improve the performance of remote task invocations when the tasks run locally or on the same cluster node, because all tasks on a node share the object store in shared memory. But passing the same large object repeatedly inside a loop forces Ray to store it again each time and degrades performance. Take a look at this code example here.

💁‍♀️Tip 4: When you have a large number of tasks, in the 1,000s or 10,000s, and each task varies in how long it takes to finish, blocking with ray.get() on all their object IDs before processing the results can take a long time. One way to avoid this is data pipelining: process the tasks that have already finished in a loop while the rest are still executing. Here is the link to the code that illustrates how to pipeline data processing for many tasks.

Figure 1: (a) Execution timeline when using ray.get() to wait for all results from do_some_work() tasks before calling process_results(). (b) Execution timeline when using ray.wait() to process results as soon as they become available.

💁‍♀️Tip 5: Lastly, avoid simple Ray gotchas. Read about how to deal with environment variables, filenames, and placement groups across tasks and actors in a Ray cluster.

Read days 1–8 of #30DaysOfRay if you missed them, along with the Tips & Tricks notebook.

Conclusion

Let’s sum up:

  1. Delay using ray.get() until you need the results.
  2. Avoid tiny Ray remote tasks because they add overhead.
  3. Better to pass an object reference to large data when the same data is used repeatedly by remote Ray tasks.
  4. Use ray.wait() on a large number of tasks, and process only the finished ones, taking advantage of data pipelining.
  5. Don’t assume local environment variables or file paths are available to workers running tasks on the Ray cluster nodes.


Jules S. Damji

Developer at heart; Advocate by nature | Communicator by choice | Avid Arsenal Fan| Love Reading, Writing, APIs, Coding | All Tweets are Mine