Detecting Inclusive Language in My Codebase With Sonatype Lift

June 27, 2022 By Theresa Mammarella


This past Monday, Sonatype employees around the world took the day off to reflect on the American holiday of Juneteenth. One of the discussions on our internal Slack was about inclusive language and its impact in the workplace. We concluded that continually reexamining our unconscious bias is key. This isn't always easy, especially when terms like "master" and "slave" have been baked into computer networking and architecture models for decades.

In recent years, GitHub took steps to remove some of these terms from the platform, most visibly by renaming the default branch from "master" to "main". But surely there's more we can do to help eradicate non-inclusive terms from the software industry.

This week I had a chance to play around with Lift's extensibility API to address just this issue. This feature allows me to add custom static analysis tools to run on my GitHub pull requests. I found an open-source tool written by an engineer at Datadog: Woke. Woke is a linting tool that searches code bases for language that is not inclusive.

The Lift documentation provides a really simple set of functions that need to be defined to add a custom tool:

  • version: Returns the version of the tool.
  • applicable: Determines if the custom tool should be run on this particular code base. For example, some tools may apply only to specific programming languages.
  • run: Analyzes the code base and returns any findings in the form of the following JSON object:
{
  "type" : <string>,
  "message" : <string>,
  "file" : <string>,
  "line" : <int>,
  "details_url" : <optional string>
}
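
To make the three functions concrete, here is a minimal Python sketch of what a custom tool skeleton could look like. It is not my actual script. It assumes that Lift calls the executable with a repository directory, a commit, and a command name, and that run prints a JSON array of finding objects; check both assumptions against the Lift documentation. The helper names (make_finding, dispatch) and the example finding are made up for illustration.

```python
#!/usr/bin/env python3
"""Illustrative skeleton of a Lift custom tool (a sketch, not the author's script)."""
import json
import sys


def version():
    # Assumption: a plain integer is an acceptable version value.
    return 1


def applicable(repo_dir):
    # A language-agnostic linter can run on any code base.
    return True


def make_finding(kind, message, file, line, details_url=None):
    """Build one finding with the fields listed in the JSON object above."""
    finding = {"type": kind, "message": message, "file": file, "line": int(line)}
    if details_url is not None:
        # details_url is optional, so it is only included when provided.
        finding["details_url"] = details_url
    return finding


def run(repo_dir):
    # Placeholder result: a real tool would analyze repo_dir here.
    return [make_finding("Example", "placeholder finding", "README.md", 1)]


def dispatch(command, repo_dir):
    """Return the text the tool would print for a given command."""
    if command == "version":
        return str(version())
    if command == "applicable":
        return "true" if applicable(repo_dir) else "false"
    if command == "run":
        return json.dumps(run(repo_dir))
    raise ValueError(f"unknown command: {command}")


if __name__ == "__main__" and len(sys.argv) == 4:
    # Assumed invocation: <tool> <repo_dir> <commit> <command>
    repo_dir, _commit, command = sys.argv[1:4]
    print(dispatch(command, repo_dir))
```

Keeping dispatch separate from the command-line handling makes each command easy to exercise from a test without going through Lift.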

Using these guidelines, I wrote a simple script that downloads the Woke project, runs it on the current repository, and converts the output to Lift-consumable JSON. When I ran Lift on a pull request against my simple Hello World demo, this was the result:

[Screenshot: Lift results on the pull request]

From the Lift dashboard:

[Screenshot: Lift dashboard view]

Woke can also be customized around the terms it searches for.
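
The conversion step in my script, turning linter output into Lift findings, can be sketched roughly as follows. This is an illustration under assumptions rather than the real script: it assumes a hypothetical line-oriented report of the form "file:line:column: message", which may not match Woke's actual output, and the function name and default type label are invented here.

```python
import re

# Hypothetical report line: "<file>:<line>:<column>[-<end column>]: <message>"
# (an assumed format for illustration; Woke's real output may differ).
REPORT_LINE = re.compile(
    r"^(?P<file>[^:]+):(?P<line>\d+):\d+(?:-\d+)?: (?P<message>.+)$"
)


def to_lift_findings(report, finding_type="Inclusive language"):
    """Convert a linter's text report into a list of Lift-style finding dicts."""
    findings = []
    for raw in report.splitlines():
        match = REPORT_LINE.match(raw.strip())
        if match is None:
            # Skip blank lines and anything that is not a finding line.
            continue
        findings.append(
            {
                "type": finding_type,
                "message": match["message"],
                "file": match["file"],
                "line": int(match["line"]),
            }
        )
    return findings
```

Passing the result through json.dumps gives a JSON array whose objects carry the type, message, file, and line fields described above.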

To try it out on your own projects, simply reuse my .lift/woke file and add the following line to .lift/config.toml in your project's root directory:

customTools = [ ".lift/woke" ]

To make your own custom plugin, choose your favorite static analysis tool and follow the API above. More details and examples, including simple test cases, can be found here.

What other types of language would you like to see detected on pull requests? Profanity is the next one that comes to my mind; flagging it would help make open-source projects more welcoming for everyone.


Written by Theresa Mammarella

Theresa (she/her) is a developer advocate and open source contributor for Java runtimes and compiler projects.