Contributing Guidelines

Read the following guide if you’re interested in contributing to cluster-api.

Contributors who are not used to working in the Kubernetes ecosystem should also take a look at the Kubernetes New Contributor Course.

Contributor License Agreements

We’d love to accept your patches! Before we can take them, we have to jump a couple of legal hurdles.

Please fill out either the individual or corporate Contributor License Agreement (CLA). More information about the CLA and instructions for signing it can be found here.

NOTE: Only original source code from you and other people that have signed the CLA can be accepted into the *repository.

Finding Things That Need Help

If you’re new to the project and want to help, but don’t know where to start, we have a semi-curated list of issues that should not need deep knowledge of the system. Have a look and see if anything sounds interesting. Before starting to work on the issue, make sure that it doesn’t have a lifecycle/active label. If the issue has been assigned, reach out to the assignee. Alternatively, read some docs on other controllers and try to write your own, file and fix any/all issues that come up, including gaps in documentation!

If you’re a more experienced contributor, looking at unassigned issues in the next release milestone is a good way to find work that has been prioritized. For example, if the latest minor release is v1.0, the next release milestone is v1.1.

Help and contributions are very welcome in the form of code contributions but also in helping to moderate office hours, triaging issues, fixing/investigating flaky tests, being part of the release team, helping new contributors with their questions, reviewing proposals, etc.

Versioning

Codebase and Go Modules

⚠ The project does not follow Go Modules guidelines for compatibility requirements for 1.x semver releases.

Cluster API follows upstream Kubernetes semantic versioning. With the v1 release of our codebase, we guarantee the following:

  • A (minor) release CAN include:

    • Introduction of new API versions, or new Kinds.
    • Compatible API changes like field additions, deprecation notices, etc.
    • Breaking API changes for deprecated APIs, fields, or code.
    • Features, promotion or removal of feature gates.
    • And more!
  • A (patch) release SHOULD only include backwards compatible set of bugfixes.

These guarantees extend to all code exposed in our Go Module, including types from dependencies in public APIs. Types and functions not in public APIs are not considered part of the guarantee. The test module, clusterctl, and experiments do not provide any backward compatible guarantees.

Backporting a patch

We generally do not accept PRs directly against release branches, while we might accept backports of fixes/changes already merged into the main branch.

Any backport MUST not be breaking for either API or behavioral changes.

We generally allow backports of following changes to all supported branches:

  • Critical bugs fixes, security issue fixes, or fixes for bugs without easy workarounds.
  • Dependency bumps for CVE (usually limited to CVE resolution; backports of non-CVE related version bumps are considered exceptions to be evaluated case by case)
  • Cert-manager version bumps (to avoid having releases with cert-manager versions that are out of support, when possible)
  • Changes required to support new Kubernetes versions, when possible. See supported Kubernetes versions for more details.
  • Changes to use the latest Go patch version to build controller images.
  • Changes to bump the Go minor version used to build controller images, if the Go minor version of a supported branch goes out of support (e.g. to pick up bug and CVE fixes). This has no impact on folks importing Cluster API as we won’t modify the version in go.mod and the version in the Makefile does not affect them.

We generally allow backports of following changes only to the latest supported branch:

  • Improvements to existing docs (the latest supported branch hosts the current version of the book)
  • Improvements to CI signal
  • Improvements to the test framework

While we recommend to target following type of changes to the next minor release, CAPI maintainers will also consider exceptions for backport of following changes only to the latest supported branch:

  • Enhancements or additions to experimental Cluster API features, with the goal of allowing faster adoption and iteration; Please note that stability of the branch will always be top priority while evaluating those PRs, and thus approval requires /lgtm from at least two maintainers that, on top of checking that the backport is not introducing any breaking change for either API or behavior, will evaluate if the impact of those backport is limited and well-scoped e.g. by checking that those changes should not touch non-experimental code paths like utils and/or by applying other considerations depending on the specific PR.

Like any other activity in the project, backporting a fix/change is a community-driven effort and requires that someone volunteers to own the task. In most cases, the cherry-pick bot can (and should) be used to automate opening a cherry-pick PR.

We generally do not accept backports to Cluster API release branches that are out of support.

APIs

API versioning and guarantees are inspired by the Kubernetes deprecation policy and API change guidelines. We follow the API guidelines as much as possible adapting them if necessary and on a case-by-case basis to CustomResourceDefinition.

CLIs

Any command line interface in Cluster API (e.g. clusterctl) share the same versioning schema of the codebase. CLI guarantees are inspired by Kubernetes deprecation policy for CLI, however we allow breaking changes after 8 months or 2 releases (whichever is longer) from deprecation.

Branches

Cluster API has two types of branches: the main branch and release-X branches.

The main branch is where development happens. All the latest and greatest code, including breaking changes, happens on main.

The release-X branches contain stable, backwards compatible code. On every major or minor release, a new branch is created. It is from these branches that minor and patch releases are tagged. In some cases, it may be necessary to open PRs for bugfixes directly against stable branches, but this should generally not be the case.

Support and guarantees

Cluster API maintains the most recent release/releases for all supported API and contract versions. Support for this section refers to the ability to backport and release patch versions; backport policy is defined above.

  • The API version is determined from the GroupVersion defined in the top-level api/ package.
  • The EOL date of each API Version is determined from the last release available once a new API version is published.
API VersionSupported Until
v1beta1TBD (current stable)
  • For the current stable API version (v1beta1) we support the two most recent minor releases; older minor releases are immediately unsupported when a new major/minor release is available.
  • For older API versions we only support the most recent minor release until the API version reaches EOL.
  • We will maintain test coverage for all supported minor releases and for one additional release for the current stable API version in case we have to do an emergency patch release. For example, if v1.2 and v1.3 are currently supported, we will also maintain test coverage for v1.1 for one additional release cycle. When v1.4 is released, tests for v1.1 will be removed.
Minor ReleaseAPI VersionSupported Until
v1.7.xv1beta1when v1.9.0 will be released
v1.6.xv1beta1when v1.8.0 will be released
v1.5.xv1beta1EOL since 2024-04-16 - v1.7.0 release date
v1.4.xv1beta1EOL since 2023-12-05 - v1.6.0 release date
v1.3.xv1beta1EOL since 2023-07-25 - v1.5.0 release date
v1.2.xv1beta1EOL since 2023-03-28 - v1.4.0 release date
v1.1.xv1beta1EOL since 2022-07-18 - v1.2.0 release date (*)
v1.0.xv1beta1EOL since 2022-02-02 - v1.1.0 release date (*)
v0.4.xv1alpha4EOL since 2022-04-06 - API version EOL
v0.3.xv1alpha3EOL since 2022-02-23 - API version EOL

(*) Previous support policy applies, older minor releases were immediately unsupported when a new major/minor release was available

  • Exceptions can be filed with maintainers and taken into consideration on a case-by-case basis.

Removal of v1alpha3 & v1alpha4 apiVersions

Both v1alpha3 and v1alpha4 have been removed from Cluster API as of release 1.7.

For more details and latest information please see the following issue: Removing v1alpha3 & v1alpha4 apiVersions.

Note: Removal of a deprecated APIVersion in Kubernetes can cause issues with garbage collection by the kube-controller-manager This means that some objects which rely on garbage collection for cleanup - e.g. MachineSets and their descendent objects, like Machines and InfrastructureMachines, may not be cleaned up properly if those objects were created with an APIVersion which is no longer served. To avoid these issues it’s advised to ensure a restart to the kube-controller-manager is done after upgrading to a version of Cluster API which drops support for an APIVersion - e.g. v1.5 and v1.6. This can be accomplished with any Kubernetes control-plane rollout, including a Kubernetes version upgrade, or by manually stopping and restarting the kube-controller-manager.

Contributing a Patch

  1. If you haven’t already done so, sign a Contributor License Agreement (see details above).
  2. If working on an issue, signal other contributors that you are actively working on it using /lifecycle active.
  3. Fork the desired repo, develop and test your code changes.
  4. Submit a pull request.
    1. All code PR must be labeled with one of
      • ⚠️ (:warning:, major or breaking changes)
      • ✨ (:sparkles:, feature additions)
      • 🐛 (:bug:, patch and bugfixes)
      • 📖 (:book:, documentation or proposals)
      • 🌱 (:seedling:, minor or other)
  5. If your PR has multiple commits, you must squash them into a single commit before merging your PR.

Individual commits should not be tagged separately, but will generally be assumed to match the PR. For instance, if you have a bugfix in with a breaking change, it’s generally encouraged to submit the bugfix separately, but if you must put them in one PR, mark the commit separately.

All changes must be code reviewed. Coding conventions and standards are explained in the official developer docs. Expect reviewers to request that you avoid common go style mistakes in your PRs.

Documentation changes

The documentation is published in form of a book at:

The source for the book is this folder containing markdown files and we use mdBook to build it into a static website.

After making changes locally you can run make serve-book which will build the HTML version and start a web server, so you can preview if the changes render correctly at http://localhost:3000; the preview auto-updates when changes are detected.

Note: you don’t need to have mdBook installed, make serve-book will ensure appropriate binaries for mdBook and any used plugins are downloaded into hack/tools/bin/ directory.

When submitting the PR remember to label it with the 📖 (:book:) icon.

Releases

Cluster API release process is described in this document.

Proposal process (CAEP)

The Cluster API Enhancement Proposal is the process this project uses to adopt new features, changes to the APIs, changes to contracts between components, or changes to CLI interfaces.

The template, and accepted proposals live under docs/proposals.

  • Proposals or requests for enhancements (RFEs) MUST be associated with an issue.
    • Issues can be placed on the roadmap during planning if there is one or more folks that can dedicate time to writing a CAEP and/or implementing it after approval.
  • A proposal SHOULD be introduced and discussed during the weekly community meetings or on the Kubernetes SIG Cluster Lifecycle mailing list.
  • A proposal in a Google Doc MUST turn into a Pull Request.
  • Proposals MUST be merged and in implementable state to be considered part of a major or minor release.

Triaging issues

Issue triage in Cluster API follows the best practices of the Kubernetes project while seeking balance with the different size of this project.

While the maintainers play an important role in the triage process described below, the help of the community is crucial to ensure that this task is performed timely and be sustainable long term.

PhaseResponsibleWhat is required to move forward
Initial triageMaintainersThe issue MUST have:
- priority/* label
- kind/* label
Triage finalizationEveryoneThere should be consensus on the way forward and enough details for the issue being actionable
Triage finalizationMaintainersThe issue MUST have:
- triage/accepted label
label, plus eventually help or good-first-issue label
ActionableEveryoneContributors volunteering time to do the work and reviewers/approvers bandwidth
The issue being fixed

Please note that:

  • Priority provides an indication to everyone looking at issues.

    • When assigning priority several factors are taken into consideration, including impact on users, relevance for the upcoming releases, maturity of the issue (consensus + completeness).
    • priority/awaiting-more-evidence is used to mark issue where there is not enough info to take a decision for one of the other priorities values.
    • Priority can change over time, and everyone is welcome to provide constructive feedback about updating an issue’s priority.
    • Applying a priority label is not a commitment to execute within a certain time frame, because implementation depends on contributors volunteering time to do the work and on reviewers/approvers bandwidth.
  • Closing inactive issues which are stuck in the “triage” phases is a crucial task for maintaining an actionable backlog. Accordingly, the following automation applies to issues in the “triage” or the “refinement” phase:

    • After 90 days of inactivity, issues will be marked with the lifecycle/stale label
    • After 30 days of inactivity from when lifecycle/stale was applied, issues will be marked with the lifecycle/rotten label
    • After 30 days of inactivity from when lifecycle/rotten was applied, issues will be closed. With this regard, it is important to notice that closed issues are and will always be a highly valuable part of the knowledge base about the Cluster API project, and they will never go away.
    • Note:
      • The automation above does not apply to issues triaged as priority/critical-urgent, priority/important-soon or priority/important-longterm
      • Maintainers could apply the lifecycle/frozen label if they want to exclude an issue from the automation above
      • Issues excluded from the automation above will be re-triaged periodically
  • If you really care about an issue stuck in the “triage” phases, you can engage with the community or try to figure out what is holding back the issue by yourself, e.g.:

    • Issue too generic or not yet actionable
    • Lack of consensus or the issue is not relevant for other contributors
    • Lack of contributors; in this case, finding ways to help and free up maintainers/other contributors time from other tasks can really help to unblock your issues.
  • Issues in the “actionable” state are not subject to the stale/rotten/closed process; however, it is required to re-assess them periodically given that the project change quickly. Accordingly, the following automation applies to issues in the “actionable” phase:

    • After 30 days of inactivity, the triage/accepted label will be removed from issues with priority/critical-urgent
    • After 90 days of inactivity the triage/accepted label will be removed from issues with priority/important-soon
    • After 1 year of inactivity the triage/accepted label will be removed from issues without priority/critical-urgent or priority/important-soon
  • If you really care about an issue stuck in the “actionable” phase, you can try to figure out what is holding back the issue implementation (usually lack of contributors), engage with the community, find ways to help and free up maintainers/other contributors time from other tasks, or /assign the issue and send a PR.

Triaging E2E test failures

When you submit a change to the Cluster API repository as set of validation jobs is automatically executed by prow and the results report is added to a comment at the end of your PR.

Some jobs run linters or unit test, and in case of failures, you can repeat the same operation locally using make test lint [etc..] in order to investigate and potential issues. Prow logs usually provide hints about the make target you should use (there might be more than one command that needs to be run).

End-to-end (E2E) jobs create real Kubernetes clusters by building Cluster API artifacts with the latest changes. In case of E2E test failures, usually it’s required to access the “Artifacts” link on the top of the prow logs page to triage the problem.

The artifact folder contains:

  • A folder with the clusterctl local repository used for the test, where you can find components yaml and cluster templates.
  • A folder with logs for all the clusters created during the test. Following logs/info are available:
    • Controller logs (only if the cluster is a management cluster).
    • Dump of the Cluster API resources (only if the cluster is a management cluster).
    • Machine logs (only if the cluster is a workload cluster)

In case you want to run E2E test locally, please refer to the Testing guide. An overview over our e2e test jobs (and also all our other jobs) can be found in Jobs.

Reviewing a Patch

Reviews

Parts of the following content have been adapted from https://google.github.io/eng-practices/review.

Any Kubernetes organization member can leave reviews and /lgtm a pull request.

Code reviews should generally look at:

  • Design: Is the code well-designed and consistent with the rest of the system?
  • Functionality: Does the code behave as the author (or linked issue) intended? Is the way the code behaves good for its users?
  • Complexity: Could the code be made simpler? Would another developer be able to easily understand and use this code when they come across it in the future?
  • Tests: Does the code have correct and well-designed tests?
  • Naming: Did the developer choose clear names for variable, types, methods, functions, etc.?
  • Comments: Are the comments clear and useful? Do they explain why rather than what?
  • Documentation: Did the developer also update relevant documentation?

See Code Review in Cluster API for a more focused list of review items.

Approvals

Please see the Kubernetes community document on pull requests for more information about the merge process.

  • A PR is approved by one of the project maintainers and owners after reviews.
  • Approvals should be the very last action a maintainer takes on a pull request.

Features and bugs

Open issues to report bugs, or discuss minor feature implementation.

Each new issue will be automatically labeled as needs-triage; after being triaged by the maintainers the label will be removed and replaced by one of the following:

  • triage/accepted: Indicates an issue or PR is ready to be actively worked on.
  • triage/duplicate: Indicates an issue is a duplicate of another open issue.
  • triage/needs-information: Indicates an issue needs more information in order to work on it.
  • triage/not-reproducible: Indicates an issue can not be reproduced as described.
  • triage/unresolved: Indicates an issue that can not or will not be resolved.

For big feature, API and contract amendments, we follow the CAEP process as outlined below.

Experiments

Proof of concepts, code experiments, or other initiatives can live under the exp folder or behind a feature gate.

  • Experiments SHOULD not modify any of the publicly exposed APIs (e.g. CRDs).
  • Experiments SHOULD not modify any existing CRD types outside the experimental API group(s).
  • Experiments SHOULD not modify any existing command line contracts.
  • Experiments MUST not cause any breaking changes to existing (non-experimental) Go APIs.
  • Experiments SHOULD introduce utility helpers in the go APIs for experiments that cross multiple components and require support from bootstrap, control plane, or infrastructure providers.
  • Experiments follow a strict lifecycle: Alpha -> Beta prior to Graduation.
    • Alpha-stage experiments:
      • SHOULD not be enabled by default and any feature gates MUST be marked as ‘Alpha’
      • MUST be associated with a CAEP that is merged and in at least a provisional state
      • MAY be considered inactive and marked as deprecated if the following does not happen within the course of 1 minor release cycle:
        • Transition to Beta-stage
        • Active development towards progressing to Beta-stage
        • Either direct or downstream user evaluation
      • Any deprecated Alpha-stage experiment MAY be removed in the next minor release.
    • Beta-stage experiments:
      • SHOULD be enabled by default, and any feature gates MUST be marked as ‘Beta’
      • MUST be associated with a CAEP that is at least in the experimental state
      • MUST support conversions for any type changes
      • MUST remain backwards compatible unless updates are coinciding with a breaking Cluster API release
      • MAY be considered inactive and marked as deprecated if the following does not happen within the course of 1 minor release cycle:
        • Graduate
        • Active development towards Graduation
        • Either direct or downstream user consumption
      • Any deprecated Beta-stage experiment MAY be removed after being deprecated for an entire minor release.
  • Experiment Graduation MUST coincide with a breaking Cluster API release
  • Experiment Graduation checklist:
    • MAY provide a way to be disabled, any feature gates MUST be marked as ‘GA’
    • MUST undergo a full Kubernetes-style API review and update the CAEP with the plan to address any issues raised
    • CAEP MUST be in an implementable state and is fully up-to-date with the current implementation
    • CAEP MUST define transition plan for moving out of the experimental api group and code directories
    • CAEP MUST define any upgrade steps required for Existing Management and Workload Clusters
    • CAEP MUST define any upgrade steps required to be implemented by out-of-tree bootstrap, control plane, and infrastructure providers.

Breaking Changes

Breaking changes are generally allowed in the main branch, as this is the branch used to develop the next minor release of Cluster API.

There may be times, however, when main is closed for breaking changes. This is likely to happen as we near the release of a new minor version.

Breaking changes are not allowed in release branches, as these represent minor versions that have already been released. These versions have consumers who expect the APIs, behaviors, etc. to remain stable during the lifetime of the patch stream for the minor release.

Examples of breaking changes include:

  • Removing or renaming a field in a CRD
  • Removing or renaming a CRD
  • Removing or renaming an exported constant, variable, type, or function
  • Updating the version of critical libraries such as controller-runtime, client-go, apimachinery, etc.
    • Some version updates may be acceptable, for picking up bug fixes, but maintainers must exercise caution when reviewing.

There may, at times, need to be exceptions where breaking changes are allowed in release branches. These are at the discretion of the project’s maintainers, and must be carefully considered before merging. An example of an allowed breaking change might be a fix for a behavioral bug that was released in an initial minor version (such as v0.3.0).

Dependency Licence Management

Cluster API follows the license policy of the CNCF. This sets limits on which licenses dependencies and other artifacts use. For go dependencies only dependencies listed in the go.mod are considered dependencies. This is in line with how dependencies are reviewed in Kubernetes.

API conventions

This project follows the Kubernetes API conventions. Minor modifications or additions to the conventions are listed below.

Optional vs. Required

  • Status fields MUST be optional. Our controllers are patching selected fields instead of updating the entire status in every reconciliation.

  • If a field is required (for our controllers to work) and has a default value specified via OpenAPI schema, but we don’t want to force users to set the field, we have to mark the field as optional. Otherwise, the client-side kubectl OpenAPI schema validation will force the user to set it even though it would be defaulted on the server-side.

Optional fields have the following properties:

  • An optional field MUST be marked with +optional and include an omitempty JSON tag.
  • Fields SHOULD be pointers if there is a good reason for it, for example:
    • the nil and the zero values (by Go standards) have semantic differences.
      • Note: This doesn’t apply to map or slice types as they are assignable to nil.
    • the field is of a struct type, contains only fields with omitempty and you want to prevent that it shows up as an empty object after marshalling (e.g. kubectl get)

Example

When using ClusterClass, the semantic difference is important when you have a field in a template which will have instance-specific different values in derived objects. Because in this case it’s possible to set the field to nil in the template and then the value can be set in derived objects without being overwritten by the cluster topology controller.

Exceptions

  • Fields in root objects should be kept as scaffolded by kubebuilder, e.g.:

    type Machine struct {
      metav1.TypeMeta   `json:",inline"`
      metav1.ObjectMeta `json:"metadata,omitempty"`
    
      Spec   MachineSpec   `json:"spec,omitempty"`
      Status MachineStatus `json:"status,omitempty"`
    }
    type MachineList struct {
      metav1.TypeMeta `json:",inline"`
      metav1.ListMeta `json:"metadata,omitempty"`
      Items           []Machine `json:"items"`
    }
    
  • Top-level fields in status must always have the +optional annotation. If we want the field to be always visible even if it has the zero value, it must not have the omitempty JSON tag, e.g.:

    • Replica counters like availableReplicas in the MachineDeployment
    • Flags expressing progress in the object lifecycle like infrastructureReady in Machine

CRD additionalPrinterColumns

All our CRD objects should have the following additionalPrinterColumns order (if the respective field exists in the CRD):

  • Namespace (added automatically)
  • Name (added automatically)
  • Cluster
  • Other fields
  • Replica-related fields
  • Phase
  • Age (mandatory field for all CRDs)
  • Version
  • Other fields for -o wide (fields with priority 1 are only shown with -o wide and not per default)

NOTE: The columns can be configured via the kubebuilder:printcolumn annotation on root objects. For examples, please see the ./api package.

Examples:

kubectl get kubeadmcontrolplane
NAMESPACE            NAME                               INITIALIZED   API SERVER AVAILABLE   REPLICAS   READY   UPDATED   UNAVAILABLE   AGE     VERSION
quick-start-d5ufye   quick-start-ntysk0-control-plane   true          true                   1          1       1                       2m44s   v1.23.3
kubectl get machinedeployment
NAMESPACE            NAME                      CLUSTER              REPLICAS   READY   UPDATED   UNAVAILABLE   PHASE       AGE     VERSION
quick-start-d5ufye   quick-start-ntysk0-md-0   quick-start-ntysk0   1                  1         1             ScalingUp   3m28s   v1.23.3

Google Doc Viewing Permissions

To gain viewing permissions to google docs in this project, please join either the kubernetes-dev or kubernetes-sig-cluster-lifecycle google group.

Issue and Pull Request Management

Anyone may comment on issues and submit reviews for pull requests. However, in order to be assigned an issue or pull request, you must be a member of the Kubernetes SIGs GitHub organization.

If you are a Kubernetes GitHub organization member, you are eligible for membership in the Kubernetes SIGs GitHub organization and can request membership by opening an issue against the kubernetes/org repo.

However, if you are a member of the related Kubernetes GitHub organizations but not of the Kubernetes org, you will need explicit sponsorship for your membership request. You can read more about Kubernetes membership and sponsorship here.

Cluster API maintainers can assign you an issue or pull request by leaving a /assign <your Github ID> comment on the issue or pull request.

Contributors Ladder

New contributors are welcomed to the community by existing members, helped with PR workflow, and directed to relevant documentation and communication channels. We are also committed in helping people willing to do so in stepping up through the contributor ladder and this paragraph describes how we are trying to make this to happen.

As the project adoption increases and the codebase keeps growing, we’re trying to break down ownership into self-driven subareas of interest. Requirements from the Kubernetes community membership guidelines apply for reviewers, maintainers and any member of these subareas. Whenever you meet requisites for taking responsibilities in a subarea, the following procedure should be followed:

  1. Submit a PR.
  2. Propose at community meeting.
  3. Get positive feedback and +1s in the PR and wait one week lazy consensus after agreement.

As of today there are following OWNERS files/Owner groups defining sub areas: