Bringing the nodejs/help backlog down by 90% over 2 months, and best practice recommendations for raising issues
This post is from Gireesh Punathil, who is a Node.js Collaborator and the Service Architect for IBM Runtime technologies. His work focuses on improving the Node.js user experience. It originally appeared here.
A little while back I noticed a relatively large backlog of about 200 issues in the nodejs/help repo, and decided to see what I could do to help bring that number down. (The main nodejs/node issue tracker covers bugs and issues found within Node.js core; general support with Node.js is covered under nodejs/help.)
With the high degree of collaboration and information exchange typical in the Node.js community, we were able to bring down the backlog from ~200 to ~20 in a span of 2 months!
While reflecting on that success, I want to share some observations from this effort that should help both Node.js end users (consumers) and the Node.js community itself.
I hope to answer the following questions:
- What are the most common types of issues reported?
- What are the most common reasons for issues?
- What are the most common paths to resolution?
What are the most common types of issues reported?
While the linked items are representative of common issues, they are not in any particular order.
- child_process: signaling, data flow, life cycle. A recent problem pattern is broken data flow when spawning a hierarchy of child processes mixed with Node.js and non-Node.js processes — typically OS commands that pre-date Node.js and that do not expect their stdios to be non-blocking.
- install: installation / reinstallation / uninstallation. While the Node.js install bundle embodies one of the most compact binaries for a language runtime, users have trouble associating / managing their Node.js installation in conjunction with
- npm: third party module not installing or not functioning. Probable causes: the platform does not support it, the build toolchain is missing / incompatible for the version, or the module resolution fails for the install location.
- doc: mismatch in the expectation (doc) and observations (actuals) on APIs. The documentation walks a thin line between being more descriptive for clarity and being more compact for discovery. This causes challenges for some users in certain document sections.
- stream: tapping and piping stream data, event orders and lifecycle. General clarifications around feeding data to streams, consuming or transforming stream data, managing back-pressure, and lifecycle events and their ordering.
- build, deploy: platform support, build toolchain, runtime linking. Problems pertinent to building and running Node.js on newer platforms, and problems that appear when building native addons or Node.js itself on hosts that have different compilers and / or runtime libraries.
- addon: cross-language binding with native addons, n-api. Integrating JS code with native addons with varying invocation and consumption models.
- shared lib: embed Node.js in C++ apps. Consuming Node.js for sharing I/O workload in heterogeneous deployments. In some cases, wanting to couple and decouple Node.js at different levels of abstraction and capacity.
- crypto: custom cryptographic extensions and plugins. Issues with shared (system-supplied) SSL library, issues with configuring custom SSL engine, a need to custom build Node.js for FIPS compliance.
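The child_process item above (signaling, data flow, life cycle) can be made concrete with a small sketch. It assumes a child that is itself a Node.js process started through `process.execPath`; `runChild` is a hypothetical helper written for this illustration, not a Node.js API. It shows the three things that are usually worth quoting in a help request: the exit code, the terminating signal, and the data that actually arrived on stdout.

```javascript
const { spawnSync } = require('node:child_process');

// Hypothetical helper (not part of Node.js): runs a short script in a child
// Node.js process and returns the parts of the result that matter in a report.
function runChild(script) {
  const result = spawnSync(process.execPath, ['-e', script], { encoding: 'utf8' });
  if (result.error) throw result.error; // the child could not be started at all
  return {
    status: result.status, // exit code, or null if the child was killed by a signal
    signal: result.signal, // signal name, or null if the child exited on its own
    stdout: result.stdout, // everything the child wrote to its stdout
  };
}

const ok = runChild('process.stdout.write("hello from child")');
const failed = runChild('process.exit(3)');

console.log(ok);
console.log(failed);
```

The synchronous variant is used here only to keep the sketch short. The same three fields are available asynchronously through the `exit` / `close` events and the `stdout` stream of a child created with `spawn`.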
What are the most common reasons for issues?
- Architecture: Code built with a linear (synchronous) programming paradigm breaks in an asynchronous flow. For example, a user reports that a particular code sequence is executed, or is not executed, contrary to their expectation.
- User background: Code breaks due to incorrect perception about Node.js API abstractions. For example, a user has a program that closed the underlying descriptor of a Node.js stream through external means, and was expecting the stream to emit its lifecycle events.
- Deployment models: Code breaks in untested (or unforeseen and mostly unsupported too) deployment environments. For example, some UDP capabilities are unavailable when transforming the Node.js source for a browser target.
- Integration: Code breaks while trying to connect with heterogeneous, polyglot endpoints. For example, modules such as stream are layers that usually interact with polyglot endpoints, and because of the plurality of conventions that exist at the remote endpoints, inconsistency can arise due to incorrect usage.
- Documentation: Code does not behave according to the written doc, or the doc is not explicit.
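The Architecture item above is often easiest to explain with a few lines of code. The following is a minimal sketch, with `order` and `record` as hypothetical names introduced for the illustration. Code that reads top to bottom does not run top to bottom once callbacks are scheduled.

```javascript
// Steps are collected in the order in which they actually run.
const order = [];

function record(step) {
  order.push(step);
  console.log(step);
}

record('1: before scheduling');

// Both callbacks are only scheduled here; neither runs yet.
setTimeout(() => record('4: timer callback'), 0);
process.nextTick(() => record('3: nextTick callback'));

// This line appears last in the source, yet it runs before both callbacks,
// because the current synchronous run always completes first.
record('2: after scheduling');
```

The steps are logged in the order 1, 2, 3, 4. A user who expects the timer callback to have run by the time the last line executes will see exactly the "code executed, or not executed, unexpectedly" symptom described above.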
What are the most common paths to resolution?
- Question answered and help provided to OP’s satisfaction. In many cases, a single crisp answer addresses the problem / question, though in a few cases, explanation with sample code and output was required. By far the most common case.
- Revealed an issue / enhancement in Node.js (code or doc), fix provided. Rarely, the investigation reveals that it is a bug in the Node.js core, and that leads to a PR being raised and contributed to the project.
- Revealed an issue / enhancement in Node.js (code or doc), OP contributes the fix! This is great, and the originator (and the community) really get excited about it.
- Revealed an issue / requirement, but outside Node.js’s scope. This happens when the user’s stack involves Node.js, but the issue is not within Node.js’s purview. So this is referred elsewhere as appropriate. If it is something reusable but does not necessarily need to go into the core, user land modules (external reusable modules, mostly npm) are developed.
- No longer a problem, fixed in latest Node.js releases. The recommendation here is to always try with the latest LTS release before you raise an issue. Many known issues are resolved in the latest release, and the LTS is the most mature release.
- OP no longer interested in the support, or unavailable. This is a concern — for example a good amount of work would have been spent on problem determination by both parties, and when one side is unavailable for further progress or clarification, it causes a waste of time and effort.
Recommendations and Best practices
To get the most out of the skill and collaboration in the community, I recommend the following best practices when opening or engaging on issues:
Concepts users should gain familiarity with
- Understanding your application stack will help to identify whether the desired support is pertinent to Node.js or elsewhere in the ecosystem.
- Understanding the execution environment of Node.js (asynchronous programming model with event driven architecture) will help you quickly diagnose some issues and amend your code accordingly. This is especially important for users with other language / platform backgrounds.
- While Node.js abstracts the underlying platform and provides a unified programmer experience, understanding any unique traits of your platform (path separators in Windows, I/O buffer size in Darwin, process-scheduling order in AIX etc.) will help you to isolate problems yourself.
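As an illustration of the platform traits mentioned above, here is a small sketch of the path separator difference. It uses the `path.win32` and `path.posix` variants of the `path` module so that both behaviors can be observed on any host; the file names are made up for the example.

```javascript
const path = require('node:path');

// A Windows-style path, written with escaped backslashes.
const windowsStylePath = 'C:\\temp\\report.txt';

// The Windows variant treats "\" as a separator...
const asWindows = path.win32.basename(windowsStylePath);
// ...while the POSIX variant sees one long file name with no separators.
const asPosix = path.posix.basename(windowsStylePath);

// Joining segments also yields platform-specific separators.
const joinedWindows = path.win32.join('logs', 'app.log');
const joinedPosix = path.posix.join('logs', 'app.log');

console.log({ asWindows, asPosix, joinedWindows, joinedPosix });
```

The plain `path` functions pick one of these two behaviors based on the host platform, which is why the same script can behave differently on Windows and elsewhere.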
How to raise an issue effectively
- Fill out the issue template completely. If you don’t know what to provide in some fields, state so.
- Present your problem clearly. Ideally, add a minimal test case to reproduce the problem, a statement that shows what is expected, and output that shows the actual result.
- Detecting and reporting issues early in your development phase helps to reduce the amount of refactoring or redesigning of your application that may be required. These changes can be difficult to do later, depending on where the problem is and the resolution path that is identified.
- Be available for clarifying any questions that may come up, and while debugging. Community members frequently do not have the same deployment scenario as you, so we depend on you for the problem determination! Issues get closed due to inactivity.
- Use the Node.js API documentation; it’s content-rich as well as end-user focused. You might be able to solve many of your problems by referring to the doc.
- While it is not enforced, it is much appreciated if you are willing to contribute to the project, if need be, based on the outcome.
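A minimal test case, as recommended above, can be as short as the following sketch. The scenario is invented for illustration: a reporter expects the byte length of a string to equal its character count. The names `text`, `expected`, and `actual` are hypothetical; the point is the shape of the report, which shows the input, the expectation, and the observed result side by side.

```javascript
// Input: five characters; the second one is "é" (U+00E9).
const text = 'h\u00e9llo';

// What the reporter expects: one byte per character.
const expected = 5;

// What Node.js actually returns: "é" occupies two bytes in UTF-8.
const actual = Buffer.byteLength(text, 'utf8');

console.log('characters:', text.length);
console.log('expected:  ', expected);
console.log('actual:    ', actual);
```

A report in this form can usually be answered in a single reply, because the gap between expectation and behavior is visible without any guesswork.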
Recommendations for collaborators working on these issues
- Ask questions in the beginning about the execution environment in which the issue occurs, to help to isolate the problem context.
- Many issue reports or help requests appear with few or no clues about the requirement, due to the background of the user: understanding about the problem, programming experience, interaction with community, language barriers etc. A couple of iterations will help to gain clarity and refine the problem scope.
- Many users arrive in the nodejs/help repo through a notion of “general support for any issues while using the Node.js stack”. Often we need to direct users elsewhere; when doing so, provide some information on the relevant project area / repo / vendor from which to seek support.
- Often it is possible to represent the problem or the requirement in the form of a small test case. Writing one, together with the program output, helps to quickly clear up many gray areas and confirm the shared understanding of the parties involved.
- We use many methodologies for problem determination, and many tools to debug. Recording the diagnostic steps as well as the actual commands used helps users to relate to those methodologies and tools, and adopt them without further education.
- Linking to other discussions that are related, linking to the source code blocks that are relevant, and quoting the documentation that supports the statement helps users to figure out how and where to look for information for future problems.
- Copy known experts and SMEs in the community if you need deeper and/or design-level discussion. Collective intelligence is the most notable advantage in an open source project.
- Finally, focusing on the problem statement (the engineering side of the collaboration) alleviates frictions or shortcomings on the social side, and helps to conclude the work item with a high level of constructive engagement.
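For the first recommendation above, asking about the execution environment, one option is to hand the user a short script to run and paste back. The following is a sketch only: `collectEnvironment` is a hypothetical helper, and the selection of fields is an assumption about what is commonly useful, not a project convention.

```javascript
const os = require('node:os');

// Hypothetical helper: gathers the environment details that most often
// decide whether a problem can be reproduced on another machine.
function collectEnvironment() {
  return {
    node: process.version,             // Node.js version, for example "v20.x.y"
    v8: process.versions.v8,           // bundled V8 version
    libuv: process.versions.uv,        // bundled libuv version
    openssl: process.versions.openssl, // OpenSSL version Node.js was built with
    platform: process.platform,        // for example "linux", "darwin", "win32", "aix"
    arch: process.arch,                // for example "x64", "arm64"
    osRelease: os.release(),           // operating system release string
    execArgv: process.execArgv,        // Node.js flags passed on the command line
  };
}

const environment = collectEnvironment();
console.log(JSON.stringify(environment, null, 2));
```

Asking for this output at the start of a conversation tends to save an iteration, since the same details would otherwise be requested one at a time.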
Keeping the issue backlog within a reasonable threshold is a key characteristic of a healthy community and the ecosystem that flourishes around it. I request consumers and contributors of this repository to engage in a meaningful, timely, and constructive manner: resolving identified usage pain points, making sure that the design follows the documented behavior, and leveraging this engagement to identify potential enhancements to the core for its mid-term and long-term roadmaps.
Looking to contribute to the Node.js project?
- Start with the doc: Look at the documentation from a consumer’s perspective, and identify if there are improvements possible.
- Explore the tests: Tests provide the next level of insight into the APIs after the doc. See if you can detect a case (a control flow in the API) that is not covered by the tests, or is covered in an incomplete manner. Look around for flaky tests and pick up / fix easy ones. See if you can add new test cases. See if you can add comments around complex constructs.
- Look at the good-first-issues. Pick one that is right for you, and make a comment in it about your willingness to contribute. Ask for a mentor, if need be. State your intent and design / approach early, to avoid the PR stalling.