See https://blueprints.launchpad.net/linaro-android/+spec/linaro-android-integrate-lava-and-gerrit for the related blueprint, which answers some of the questions raised in this document.

Build-Tested Merge

We will set up a system that ensures that the tip of each Android trunk passes a test build every time it is updated. This will be a step towards having trunks that are always releasable and continuously integrated.

Now that we have Gerrit for code review of Android changes, we will trigger a build of accepted changes before they are merged.

This work is not about ensuring that every change passes LAVA tests before it is accepted.

Rationale

We wish to make this change to reduce the integration and testing burden close to a release, which will make that period less stressful and leave more time for development.

The ideal would be to always have a tip that is releasable as is, without any testing. The first step to this is to ensure that it at least always builds when it is updated. Later this can be extended to include validation testing using LAVA, which will give greater confidence in the release status of the tip at any time.

Stakeholders

  • Android team
  • Release manager

User stories

As a release manager
I want to know that the tip of each Android trunk always builds
so that I can have greater confidence that it is ready for release

As an Android Platform engineer
I want to trigger updates to tip by accepting changes in Gerrit
so that I don't spend time on the merges myself but can retain some confidence that they won't break the build

Constraints and Requirements

Must

  • Ensure that each update to tip passes a build test before it happens
  • Report problems back to the change(s) in Gerrit that were tested

Nice to have

  • Pointer to the build log is available for each update to tip for forensics
  • Report problems to the specific change(s) in Gerrit that caused the problem
  • Continue with merging any approved change(s) in Gerrit that do not break the build

Must not

  • Add more than 1 hour of latency to changes landing on tip
  • Re-test changes that have caused a build failure without human intervention

Out of scope

  • Any integration with LAVA
  • Testing every revision in the history of the tip

Success

How will we know when we are done?

  • There is a successful build log available for each tip change
  • The process to update the tip is entirely automatic from acceptance of a change in Gerrit
  • Build failures caused by an approved change are reported to that change in Gerrit

How will we measure how well we have done?

  • The number of build failures in trunk drops to near zero
    • TODO: how will we measure this?
  • The latency from accepted change to tip update is low

Design

We will run a script that will periodically scan Gerrit for approved changes.

  • TODO:
    • What API does Gerrit have for this?
      • Gerrit has an ssh api for queries. This query will find all approved changes:

        gerrit query --format=JSON --current-patch-set label:CodeReview+2

    • What will constitute an approved change for us?
      • An approved change is one that has received a code-review +2
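
A minimal sketch of the scan step in Python, assuming the query is run over ssh with the host and bot account used by the review commands later in this spec. It relies on `gerrit query --format=JSON` printing one JSON object per line and finishing with a `stats` row. The function names are hypothetical, and `status:open` is an assumption added here so that already-merged changes are not returned.

```python
import json
import subprocess

# Host, port and account are taken from the review commands later in this spec.
GERRIT_SSH = ["ssh", "-p", "29418",
              "android-build-bot@review.android.git.linaro.org"]
# status:open is an assumption added here; the spec's query only names the label.
QUERY = ["gerrit", "query", "--format=JSON", "--current-patch-set",
         "status:open", "label:CodeReview+2"]


def parse_query_output(text):
    """Return the change records from `gerrit query --format=JSON` output."""
    changes = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # The final row is a summary ({"type": "stats", ...}), not a change.
        if record.get("type") == "stats":
            continue
        changes.append(record)
    return changes


def scan_approved_changes():
    """Run the query over ssh and return the approved changes (hypothetical helper)."""
    result = subprocess.run(GERRIT_SSH + QUERY, check=True,
                            capture_output=True, text=True)
    return parse_query_output(result.stdout)
```

Keeping the parsing separate from the ssh call means the parsing can be exercised against canned output without a Gerrit server.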

These approved changes will be merged into staging trees, which will first be reset to the state of trunk.

  • TODO: does repo provide a mechanism for that reset?
    • Yes, we repo init and repo sync a baseline as usual, then we repo download the change to merge it into a staging tree.
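
The sequence of repo invocations described above can be sketched as follows. This is only an illustration: `staging_commands` and its parameters are hypothetical names, and the manifest URL and branch have to be supplied by the caller. Note that in current versions of repo, `repo download` checks out the patch set by default, and its `--cherry-pick` option applies the change on top of the synced baseline instead.

```python
import subprocess


def staging_commands(manifest_url, branch, project, change, patch_set):
    """Return the repo invocations, in order, as argv lists."""
    return [
        # Fetch the baseline (the current state of trunk).
        ["repo", "init", "-u", manifest_url, "-b", branch],
        ["repo", "sync"],
        # repo download takes the project name and CHANGE/PATCHSET.
        ["repo", "download", project, "%s/%s" % (change, patch_set)],
    ]


def prepare_tree(workdir, commands):
    """Run each command inside workdir, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, cwd=workdir, check=True)
```

A failure of the `repo download` step raises `subprocess.CalledProcessError`, which the caller could treat as the merge failure that is reported back to Gerrit.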

If the merge fails, the script will report that fact to the affected changes in Gerrit. First, it will leave a comment with the details of the merge failure. Second, it will vote -1 on those changes. This vote must prevent those changes from being detected by the scan at the start of the process.

  • TODO: can other people remove the bot's -1? If not how will a new attempt be triggered?
    • Submitting an updated patch set resets the scores to 0. A new attempt will be triggered once the updated change scores +2.
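
One way to keep rejected changes out of the next scan is to filter on the votes attached to the current patch set. The sketch below assumes the records come from `gerrit query --format=JSON --current-patch-set`, where each approval carries a `type` and a string `value`. Because the label type names differ between Gerrit versions, the sketch ignores the type and simply treats any +2 as approval and any negative vote as a veto; `is_eligible` is a hypothetical name.

```python
def vote_values(change):
    """Return the integer vote values on the change's current patch set."""
    approvals = change.get("currentPatchSet", {}).get("approvals", [])
    return [int(approval["value"]) for approval in approvals]


def is_eligible(change):
    """True if the change has a +2 and no negative vote (such as the bot's -1)."""
    values = vote_values(change)
    return 2 in values and all(value >= 0 for value in values)
```

Since a new patch set starts with no votes, a change that failed once becomes eligible again only after a reviewer scores the new patch set +2, which matches the human-intervention requirement above.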

Those staging trees will be synced to a location accessible to the build system at a known URL.

The script will then trigger a build at android-build.linaro.org of those staging trees.

  • Staging trees are unnecessary: we can create the merged tree within android-build using repo download.

If the build passes, the staging trees should overwrite the trunk trees.

  • If a build passes, we can submit a merge request through Gerrit's SSH API:

    ssh -p 29418 android-build-bot@review.android.git.linaro.org gerrit review -m 'SUCCESS testing, see $TEST_URL' --verified=+1 --submit --project=%(project)s %(sha1)s

If the build fails then the script will report that fact to the changes in Gerrit. It will do this in the same way as it does for the merge failure above.

  • We can use the SSH API to mark merge/build/test failures and provide a message indicating what failed:

    ssh -p 29418 android-build-bot@review.android.git.linaro.org gerrit review --verified=-1 --code-review=-1 -m 'FAILURE compiling, see $BUILD_URL' --project=%(project)s %(sha1)s
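
The two review commands above could be assembled by the script along these lines. This is a sketch, not the script's actual code: `review_command` and its `stage` parameter are hypothetical, and the flags are copied from the commands in this spec. Because ssh joins its arguments into a single string for the remote side, the message is quoted with `shlex.quote`, on the assumption that Gerrit's command parser accepts single-quoted strings as the commands above imply.

```python
import shlex

# Host, port and account as used in the commands above.
GERRIT_SSH = ["ssh", "-p", "29418",
              "android-build-bot@review.android.git.linaro.org"]


def review_command(project, sha1, passed, stage, url):
    """Build the argv for reporting a result on one change.

    stage is a word such as "testing" or "compiling"; url points at the log.
    """
    if passed:
        message = "SUCCESS %s, see %s" % (stage, url)
        flags = ["--verified=+1", "--submit"]
    else:
        message = "FAILURE %s, see %s" % (stage, url)
        flags = ["--verified=-1", "--code-review=-1"]
    return (GERRIT_SSH + ["gerrit", "review"] + flags +
            ["-m", shlex.quote(message), "--project=%s" % project, sha1])
```

The returned list can be passed to `subprocess.run(..., check=True)`. Building the log URL into the message in Python avoids relying on a shell variable such as `$TEST_URL` being expanded inside single quotes.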

TODO: should there be a delay in order to try and act on more than one change if they are accepted in quick succession?

TODO: should this be parallelised at all?

Thoughts?

Does Gerrit have the concept of "pre-requisite changes"? If so then those will have to be carefully considered to avoid merging things before they are ready.

The following discussion still needs to be worked into the spec and investigated further:

<james_w> I think there's one for Panda, one for Beagle etc.
<james_w> so I changed it to not be the definite article
<james_w> plus they aren't really single trees, so I had to change some more things
<james_w> I don't really have the vocabulary for this yet
<mwhudson> right, it was the "not single trees" part that confused me
<mwhudson> do you want to have this run for every landing to any tree that's referenced by any of the various builds' manifests?
<james_w> I think you land as one
<mwhudson> ah, gerrit is involved i guess here?
<james_w> you propose a single change across a bunch of trees, which is a combination of a bunch of new revisions to those trees, plus the update to the manifest to tie them all together
<james_w> yeah
<james_w> so I think we want to test that as one lump, basically do whatever is needed to do a test build with that new manifest
<mwhudson> in general i think the manifests point to the master branch of the trees
<james_w> yeah, that's my understanding
<mwhudson> so i don't think, in practice, the update to the manifest is necessary
<james_w> ah, I see
<mwhudson> if they had revnos, this would make more sense, perhaps :)
<james_w> so we'll have to do a bit of manifest rewriting to make the build use the staging trees
<james_w> it will be something like:
<james_w> grab a copy of every tree that has new revisions in the change
<james_w> add those new revisions
<james_w> submit a build on a manifest that points to the staging locations of those trees
<james_w> if all succeeds overwrite the actual place the manifest points to for those trees
<james_w> I wonder how this will work with changes to external trees though, as I think "our" Android builds pull stuff from all over
<james_w> so they will have to have a process to fork an upstream tree and use it in the build
<james_w> it's clear to me I need to learn more about the Gerrit process :-)
<mwhudson> maybe we're only interested in landing changes on git.linaro.org branches?
<james_w> maybe yeah
<james_w> I think there will be some changes to manifests to change what external tree/revision is used, and we would want to test them
<mwhudson> yeah, that's certainly true
<james_w> there's no way we can gate upstream changes on this scheme, unless we have linaro-forked versions of everything
<james_w> I'll add this to the spec for further discussion, thanks
<mwhudson> yeah, or specify precise revisions in the manifest
<mwhudson> which is possible, and maybe even what they intend
<mwhudson> but my instinct is that isn't what is planned
<james_w> yeah, that's better actually
<james_w> yeah
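
The manifest rewriting mentioned in the discussion above could look roughly like the following sketch. It assumes the usual repo manifest layout of `<remote>`, `<default>` and `<project>` elements. The function name, the `staging` remote name and the staging URL are all hypothetical, and nothing in this spec settles whether staging trees will be used at all.

```python
import xml.etree.ElementTree as ET


def point_at_staging(manifest_xml, staged_projects, staging_fetch,
                     remote_name="staging"):
    """Return manifest XML in which the staged projects use a staging remote."""
    root = ET.fromstring(manifest_xml)
    # Declare the extra remote that serves the staging copies.
    remote = ET.Element("remote", {"name": remote_name, "fetch": staging_fetch})
    root.insert(0, remote)
    for project in root.iter("project"):
        # Only projects with new revisions in the change are redirected.
        if project.get("name") in staged_projects:
            project.set("remote", remote_name)
    return ET.tostring(root, encoding="unicode")
```

Projects that are not part of the change keep their original remote, so the test build still pulls everything else from the locations the trunk manifest names.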

internal/archive/Platform/Android/Specs/BuildTestedMerge (last modified 2013-08-29 09:03:27)