Stop the low-quality contribution plague

October 20, 2022 By Eddie Knight

5 minute read time

 

I was in a discussion thread with folks from Cloud Native Computing Foundation and Kubernetes today, and this phrase came up again. The context was something along the lines of… "we're afraid that will encourage low-quality contributions."

Let's dig in to this. Also, since we're overusing this phrase today, let's just abbreviate low-quality contributions as "LQC" for the rest of this article.

The reason LQC are such a plague is because most people are used to committing code, but new open source contributors aren't used to reviewing code for a project with a large committer base.

For this article, I'm going to look at ways you can limit the number of LQC you receive and even more ways to stop making them yourself.

But before we make the list, we need to understand some of the most common culprits that earn the badge of "LQC."

What an LQC isn't…

  • A PR from an unknown person.
  • A long-lived, draft PR.
  • A PR that needs improvements.
  • A PR that needs a lot of improvements.

Sure, those things are tedious, but they don't make a PR low-quality in the sense that we’re talking about here.

What an LQC is…
I'm sure this isn't an exhaustive list, but here are the first four culprits that come to mind:

  1. Work that is contributed with no context for the reviewer(s), or no confirmation the work is needed at this time.

  2. Work beyond the scope or vision of the project.

  3. A pull request with poor hygiene.

  4. Code or documentation that doesn't follow established best practices.

How maintainers can reduce LQCs

You don't need to be reminded, but I'll mention it again: Contributors are almost universally just trying to help. Remember to assume positive intent, and it'll do wonders for your mental health (and communication).

But we still need to reduce the frequency of these well-intentioned LQCs.

So my advice for you is to take advantage of existing tools and documentation best practices. For example, check out GitHub's tooling that reminds contributors to review the docs before making a PR.

The next, more complicated option is to add some automation. Kubernetes has the most robust example of this that I’ve seen. They automate the reviewer assignment, label assignment, and explanation comments. All this in addition to the normal CI checks we would expect.

Here's an example of a draft PR I created over the weekend where you can see how these automation steps help reduce the workload on reviewers:

article-low-quality-contributions-convo

How committers can stop spreading the LQC plague

Let's go through each of the culprits, with suggested fixes.

1. Work that is contributed with no context for the reviewer(s), or no confirmation the work is needed at this time.

The solution is just to do the opposite of the problem. Give more information to the reviewers!

This can be done through the project's work tracking tools (such as GitHub Issues) or in discussion on their normal community channels. I usually recommend creating a GitHub issue and then finding the appropriate slack channel to reach out to. This allows the community to give feedback on your suggestion before you waste time building the solution.

Oh, and you'll avoid potentially duplicating work that someone else is already part-way through.

You can save a lot of work by discussing your changes before making the PR. But if it's a small change… you may just want to skip straight to the code. If you do that, be sure to provide clear, concise, helpful descriptions of both the problem and your proposed solution in the PR.

2. Work beyond the scope or vision of the project.

This can be mitigated through discussion as well. Really just follow all the advice I listed above, but avoid going directly into a pull request if your change might be outside of the normal scope of the project.

3. Bad pull request hygiene.

Read the contribution guidelines, and always make a draft PR first!

Especially look for the following in the CONTRIBUTING.md file:

  • A specific way of formatting commits or PR titles
  • Tests that must be created or modified
  • Recommended steps for getting a PR reviewed

4. Code or documentation that doesn’t follow established best practices.

Okay come on now. You really don't need me to tell you this, right? Look up the best practices for the language you're working in such as PEP8 for Python.

Additionally, documentation will likely have different "best practices" for every project. Take a look around to find out what the reviewers will expect to see documented, and make it look like it was written by the same person who wrote the rest of the docs.

Oh… and again, I shouldn't need to tell you this… but restrict your PRs to one feature/fix/refactor. Creating multiple PRs is fine if you've discussed the changes with the community ahead of time,  but big PRs are never okay.

Conclusion
The best way to avoid spreading the LQC plague is to just put yourself in the shoes of your reviewer. Imagine that you're really busy with your day job and you are taking a look at GitHub on your lunch break. What would you like to see?

You probably would get a headache if you are surprised by a new massive change request from a stranger on your project. But… you’d probably get a boost of energy if you see someone coming in and asking if they can help apply a change that would benefit the entire project!

Stop spreading the plague. Be the person who gives energy to open source maintainers.

Tags: Open Source, How-tos, Opinion Post, DevZone

Written by Eddie Knight

Eddie Knight is a Software and Cloud Engineer on Sonatype's Developer Relations team. With a background in fintech, he regularly works on open source software, including contributions to CNCF projects and working as a maintainer for the Compliant Financial Services project in FINOS. He also enjoys spending time being terrible at woodworking and creative writing.