
Automating PR merging with Bulldozer and PolicyBot

As you might have read in our previous post, we’re happily using Scala Steward for our public and internal Scala projects. It is a bot that keeps them up to date by opening Pull Requests with updated versions of dependencies. That alone is a great help, but it’s only half of the job: a developer still has to come in, verify that the PR makes sense and click the Merge button, so the process is not fully automated. In this post, we will explore ways to automate PR handling further and thus decrease the burden on developers.

For our public projects, we use Mergify, a free service for automated merging of PRs. See, for example, these projects: SST, grpc-json-bridge. For projects hosted on GitHub.com, we can highly recommend it, but what could we use for our internal GitHub Enterprise? Besides Scala Steward, we’re using Renovate for non-Scala projects. Renovate can merge its own PRs, but we would prefer to have just one tool for automated merging. That’s where Bulldozer and PolicyBot come in.

Bulldozer

Developed by Palantir and released as free software, Bulldozer is a bot (a GitHub App, to be more precise) for merging Pull Requests. Unlike Mergify, it doesn’t have a publicly hosted instance, but it is published as a container image, which makes it easy to deploy internally at our organization.

We deployed it to our internal Kubernetes cluster. The app itself is configured via environment variables. This is an excerpt from the Kubernetes deployment configuration:

      containers:
      - name: app
        image: palantirtechnologies/bulldozer:1.10.1
        ports:
          - containerPort: 8080
            name: http
        resources:
          requests:
            cpu: "100m"
            memory: "4000Mi"
          limits:
            cpu: "200m"
            memory: "8000Mi"
        env:
          - name: GITHUB_WEB_URL
            value: "https://git.company.com"
          - name: GITHUB_V3_API_URL
            value: "https://git.company.com/api/v3/"
          - name: GITHUB_APP_INTEGRATION_ID
            value: "123"
          - name: GITHUB_APP_WEBHOOK_SECRET
            valueFrom:
              secretKeyRef:
                name: bulldozer
                key: github_app_webhook_secret
          - name: GITHUB_APP_PRIVATE_KEY
            valueFrom:
              secretKeyRef:
                name: bulldozer
                key: github_app_private_key

Note that GITHUB_APP_INTEGRATION_ID is what GitHub calls “App ID”. To get the ID, Bulldozer first has to be registered as an App in your GitHub organization. The page for that looks something like https://git.company.com/organizations/your-team/settings/apps; there you click the New GitHub App button in the top right corner. The specific details of which permissions are necessary, along with other installation-related information, can be found in Bulldozer’s official README.
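The GITHUB_APP_WEBHOOK_SECRET referenced in the deployment above is a shared secret that you enter when registering the App. Here is a minimal sketch for generating such a value, assuming any sufficiently random string is acceptable. The helper name `generate_webhook_secret` is hypothetical and not part of Bulldozer.

```python
import secrets


def generate_webhook_secret(num_bytes: int = 32) -> str:
    """Return a random hex string usable as a shared secret.

    Hypothetical helper for illustration; it is not part of Bulldozer.
    """
    # token_hex yields two hex characters per random byte.
    return secrets.token_hex(num_bytes)


webhook_secret = generate_webhook_secret()
```

The resulting string would go both into the App’s webhook settings on GitHub and into the github_app_webhook_secret key of the bulldozer Kubernetes secret.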

After you have the Bulldozer App registered and the container up and running, you can install it for specific repositories or whole organizations in the Install App tab in the App’s administration panel (https://git.company.com/organizations/your-team/settings/apps/bulldozer/installations). We installed it for whole organizations because it is convenient and safe: the bot won’t touch repositories that don’t have the .bulldozer.yml configuration file in the root.

Bulldozer’s single purpose is to merge PRs. It merges only those PRs for which the required status checks pass and that satisfy certain criteria. The criteria that matter to us are labels and pr_body_substrings, and the required checks are the TeamCity CI build and PolicyBot (more on that later). For a Scala project, the .bulldozer.yml configuration file could look like this:

version: 1
merge:
  required_statuses:
    - "Build (<Project Name On TeamCity>) - merge"
    - "policy-bot: master"
  whitelist:
    labels: [ "bulldozer-merge" ]
    pr_body_substrings:
      # Scala Steward's messages
      - "labels: library-update, semver-minor"
      - "labels: library-update, semver-patch"
      - "labels: sbt-plugin-update, semver-minor"
      - "labels: sbt-plugin-update, semver-patch"
      - "labels: scalafix-rule-update"
      - "labels: test-library-update"
      # Renovate messages
      - "| patch |"
      - "| minor |"
  blacklist:
    labels: [ "bulldozer-do-not-merge" ]
  method: squash
  options:
    squash:
      body: "pull_request_body"
  delete_after_merge: true
update:
  whitelist:
    labels: [ "bulldozer-update" ]
  blacklist:
    labels: [ "bulldozer-do-not-update" ]

Notice how we tailored the pr_body_substrings to the messages that Scala Steward puts into its PRs, so that everything except major upgrades of libraries gets merged automatically. We also have the bulldozer-merge label. It is a very useful feature (unrelated to dependency bots) when you open a PR and don’t want to wait for the CI to finish: just add this label to the PR and, as soon as all required checks are green, Bulldozer will merge the PR for you without you having to come back to it. If you are using Restrict who can push to matching branches in Branch protection rules, make sure that Bulldozer is allowed push access there.
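The interplay of labels, body substrings and status checks in the configuration above can be modelled in a few lines. The following Python sketch is a simplified illustration of our reading of that configuration, not Bulldozer’s actual implementation. The names `should_merge`, `EXAMPLE_CONFIG` and `GREEN` are hypothetical, and the sketch assumes that a blacklist match always wins, that a single whitelist match is enough, and that labels are compared exactly.

```python
from typing import Iterable


def should_merge(
    labels: Iterable[str],
    body: str,
    statuses: dict[str, str],
    *,
    required_statuses: Iterable[str],
    whitelist_labels: Iterable[str],
    whitelist_substrings: Iterable[str],
    blacklist_labels: Iterable[str],
) -> bool:
    """Simplified, hypothetical model of the merge decision configured above."""
    labels = set(labels)
    # Assumption: a blacklist match always prevents the merge.
    if labels & set(blacklist_labels):
        return False
    # Assumption: one matching label or one body substring is enough.
    triggered = bool(labels & set(whitelist_labels)) or any(
        substring in body for substring in whitelist_substrings
    )
    # Every required status check has to report success.
    checks_green = all(
        statuses.get(name) == "success" for name in required_statuses
    )
    return triggered and checks_green


# Trimmed-down values taken from the .bulldozer.yml shown above.
EXAMPLE_CONFIG = dict(
    required_statuses=["policy-bot: master"],
    whitelist_labels=["bulldozer-merge"],
    whitelist_substrings=["labels: library-update, semver-patch"],
    blacklist_labels=["bulldozer-do-not-merge"],
)
GREEN = {"policy-bot: master": "success"}
```

In this model, a Scala Steward PR whose body contains "labels: library-update, semver-patch" is merged once the checks are green, while a semver-major update is left for a developer unless someone adds the bulldozer-merge label.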

As you can see, Bulldozer can be very powerful, maybe even too powerful. It could wreak havoc on your repositories if something went really wrong; after all, by merging PRs it changes the code in your master branch. It most likely won’t: it is developed by competent developers and it has even been vetted by our internal security team. Still, it’s better to be safe than sorry. Isn’t there a way to further restrict what Bulldozer can and cannot do, for example which files a PR it merges may touch?

PolicyBot

This is Bulldozer’s sidekick. It is developed by the same team and is deployed and configured in the same manner. The two bots can work independently of each other, but they play very well together. PolicyBot creates a new kind of check in a GitHub PR that is red or green based on very granular criteria, such as the full names of the changed files.

Like Bulldozer, PolicyBot is easy to deploy to Kubernetes:

      containers:
      - name: app
        image: palantirtechnologies/policy-bot:1.20.0
        ports:
          - containerPort: 8080
            name: http
        resources:
          requests:
            cpu: "100m"
            memory: "4000Mi"
          limits:
            cpu: "200m"
            memory: "8000Mi"
        env:
          - name: GITHUB_WEB_URL
            value: "https://git.company.com"
          - name: GITHUB_V3_API_URL
            value: "https://git.company.com/api/v3/"
          - name: GITHUB_V4_API_URL
            value: "https://git.company.com/api/graphql/"
          - name: GITHUB_APP_INTEGRATION_ID
            value: "321"
          - name: GITHUB_OAUTH_CLIENT_ID
            value: "Iv1.asdf"
          - name: POLICYBOT_PUBLIC_URL
            value: "https://policy-bot.k8s.company.com"
          - name: GITHUB_OAUTH_CLIENT_SECRET
            valueFrom:
              secretKeyRef:
                name: policy-bot
                key: github_oauth_client_secret
          - name: GITHUB_APP_WEBHOOK_SECRET
            valueFrom:
              secretKeyRef:
                name: policy-bot
                key: github_app_webhook_secret
          - name: GITHUB_APP_PRIVATE_KEY
            valueFrom:
              secretKeyRef:
                name: policy-bot
                key: github_app_private_key
          - name: POLICYBOT_SESSIONS_KEY
            valueFrom:
              secretKeyRef:
                name: policy-bot
                key: policybot_sessions_key

Note that GITHUB_OAUTH_CLIENT_ID is what GitHub calls “Client ID”. Register the app with GitHub first and then install it for repositories or organizations. More details about the settings, deployment and other configuration are in PolicyBot’s README. Checking the list of changed files is only a small fraction of what it can do, so check out the documentation; maybe you will be able to employ the bot for other purposes besides guarding version upgrade PRs.

The configuration file is .policy.yml, and only projects that contain it will get this new PR check. For Scala projects, we use a configuration that looks something like this:

# the high level policy
policy:
  approval:
    - or:
        - update of dependencies by Scala Steward
        - update of dependencies by Renovate
        - change by a member of the organization
        - change by a foreigner

# the list of rules
approval_rules:
  - name: update of dependencies by Scala Steward
    if:
      only_has_contributors_in:
        users: [ "scala-steward" ]
      only_changed_files:
        paths:
          - "project/Versions.scala"
          - "project/project/PluginVersions.scala"
          - "project/build.properties"
          - ".scalafmt.conf"
    requires:
      count: 0
  - name: update of dependencies by Renovate
    if:
      only_has_contributors_in:
        users: [ "renovate" ]
      only_changed_files:
        paths:
          - ".teamcity/pom.xml"
    requires:
      count: 0
  - name: change by a member of the organization
    if:
      only_has_contributors_in:
        organizations: [ "your-team" ]
    requires:
      count: 1
  - name: change by a foreigner
    requires:
      count: 1
      write_collaborators: true

You can see that we limit the files in a Scala Steward PR that Bulldozer can merge to only those that contain versions of libraries, sbt plugins, sbt itself or Scalafmt. If Scala Steward changed anything else, PolicyBot’s check would be red and Bulldozer wouldn’t merge such a PR. (By the way, that can happen for legitimate reasons, because Scala Steward can apply Scalafix rules when updating certain libraries. Those result in changes to the code, so we want a developer to have a look before merging.) The same applies to Renovate, which keeps our TeamCity configuration for the project up to date.
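The file restriction described above boils down to one question: does every changed file match one of the allowed paths? Here is a minimal Python sketch of that check, not PolicyBot’s actual code. The function name and the `STEWARD_PATHS` constant are hypothetical, and the sketch assumes that each entry under paths is a regular expression searched anywhere in the file path; check PolicyBot’s README for the exact matching semantics.

```python
import re
from typing import Iterable

# Path entries copied from the Scala Steward rule in .policy.yml above.
STEWARD_PATHS = [
    "project/Versions.scala",
    "project/project/PluginVersions.scala",
    "project/build.properties",
    ".scalafmt.conf",
]


def only_changed_files(changed: Iterable[str], patterns: Iterable[str]) -> bool:
    """Return True if every changed file matches at least one pattern.

    Assumption: patterns are regular expressions searched anywhere in the path.
    """
    compiled = [re.compile(pattern) for pattern in patterns]
    return all(
        any(regex.search(path) for regex in compiled) for path in changed
    )
```

Under this model, a PR that touches only project/Versions.scala passes, while a PR that also touches src/main/scala/Main.scala does not and would wait for a human approval. If the entries are indeed regular expressions, an unescaped dot matches any character, so anchoring and escaping the patterns (for example `^project/Versions\.scala$`) would make the rule stricter.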

PolicyBot comes with a nice web UI, where it explains how it came to the conclusion it made. You can get to it by clicking Details on the right side of the policy-bot: master row in the PR checks. Here’s an example of how it can look:

Conclusion

Bulldozer and PolicyBot have saved us developers a ton of onerous work since we deployed them. We hope that this post has inspired you to think about how you can automate your PR workflow. We owe a big thank you to the nice folks at Palantir who developed these two bots; we couldn’t have done this without them. They were also receptive to our improvements (1, 2, 3), so thanks again, @bluekeyes in particular!

To automate PRs to the maximum extent possible, and to do so fearlessly, we should also talk about preventing runtime failures, such as ClassNotFoundException, that are caused by linking incompatible transitive dependencies, or about preventing PRs that could introduce such issues from being merged. But those are issues specific to Scala/JVM and this post is already long enough, so we will explore this in a future post, where we will talk about sbt-missinglink.