Contract rules for InfraCluster

Infrastructure providers SHOULD implement an InfraCluster resource.

The goal of an InfraCluster resource is to supply whatever prerequisites (in term of infrastructure) are necessary for running machines. Examples might include networking, load balancers, firewall rules, and so on.

The InfraCluster resource will be referenced by one of the Cluster API core resources, Cluster.

The Cluster’s controller will be responsible to coordinate operations of the InfraCluster, and the interaction between the Cluster’s controller and the InfraCluster resource is based on the contract rules defined in this page.

Once contract rules are satisfied by an InfraCluster implementation, other implementation details could be addressed according to the specific needs (Cluster API is not prescriptive).

Nevertheless, it is always recommended to take a look at Cluster API controllers, in-tree providers, other providers and use them as a reference implementation (unless custom solutions are required in order to address very specific needs).

In order to facilitate the initial design for each InfraCluster resource, a few implementation best practices and infrastructure Provider Security Guidance are explicitly called out in dedicated pages.

Rules (contract version v1beta1)

RuleMandatoryNote
All resources: scopeYes
All resources: TypeMeta and ObjectMetafieldYes
All resources: APIVersion field valueYes
InfraCluster, InfraClusterList resource definitionYes
InfraCluster: control plane endpointNoMandatory if control plane endpoint is not provided by other means.
InfraCluster: failure domainsNo
InfraCluster: initialization completedYes
InfraCluster: conditionsNo
InfraCluster: terminal failuresNo
InfraClusterTemplate, InfraClusterTemplateList resource definitionNoMandatory for ClusterClasses support
Externally managed infrastructureNo
Multi tenancyNoMandatory for clusterctl CLI support
Clusterctl supportNoMandatory for clusterctl CLI support
InfraCluster: pausingNo

Note:

  • All resources refers to all the provider’s resources “core” Cluster API interacts with; In the context of this page: InfraCluster, InfraClusterTemplate and corresponding list types

All resources: scope

All resources MUST be namespace-scoped.

All resources: TypeMeta and ObjectMeta field

All resources MUST have the standard Kubernetes TypeMeta and ObjectMeta fields.

All resources: APIVersion field value

In Kubernetes APIVersion is a combination of API group and version. Special consideration MUST applies to both API group and version for all the resources Cluster API interacts with.

All resources: API group

The domain for Cluster API resources is cluster.x-k8s.io, and infrastructure providers under the Kubernetes SIGS org generally use infrastructure.cluster.x-k8s.io as API group.

If your provider uses a different API group, you MUST grant full read/write RBAC permissions for resources in your API group to the Cluster API core controllers. The canonical way to do so is via a ClusterRole resource with the aggregation label cluster.x-k8s.io/aggregate-to-manager: "true".

The following is an example ClusterRole for a FooCluster resource in the infrastructure.foo.com API group:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
    name: capi-foo-clusters
    labels:
      cluster.x-k8s.io/aggregate-to-manager: "true"
rules:
- apiGroups:
    - infrastructure.foo.com
  resources:
    - fooclusters
  verbs:
    - create
    - delete
    - get
    - list
    - patch
    - update
    - watch
- apiGroups:
    - infrastructure.foo.com
  resources:
    - fooclustertemplates
  verbs:
    - get
    - list
    - patch
    - update
    - watch

Note: The write permissions allow the Cluster controller to set owner references and labels on the InfraCluster resources; write permissions are not used for general mutations of InfraCluster resources, unless specifically required (e.g. when using ClusterClass and managed topologies).

All resources: version

The resource Version defines the stability of the API and its backward compatibility guarantees. Examples include v1alpha1, v1beta1, v1, etc. and are governed by the Kubernetes API Deprecation Policy.

Your provider SHOULD abide by the same policies.

Note: The version of your provider does not need to be in sync with the version of core Cluster API resources. Instead, prefer choosing a version that matches the stability of the provider API and its backward compatibility guarantees.

Additionally:

Providers MUST set cluster.x-k8s.io/<version> label on the InfraCluster Custom Resource Definitions.

The label is a map from a Cluster API contract version to your Custom Resource Definition versions. The value is an underscore-delimited (_) list of versions. Each value MUST point to an available version in your CRD Spec.

The label allows Cluster API controllers to perform automatic conversions for object references, the controllers will pick the last available version in the list if multiple versions are found.

To apply the label to CRDs it’s possible to use commonLabels in your kustomize.yaml file, usually in config/crd:

commonLabels:
  cluster.x-k8s.io/v1alpha2: v1alpha1
  cluster.x-k8s.io/v1alpha3: v1alpha2
  cluster.x-k8s.io/v1beta1: v1beta1

An example of this is in the Kubeadm Bootstrap provider.

InfraCluster, InfraClusterList resource definition

You MUST define a InfraCluster resource. The InfraCluster resource name must have the format produced by sigs.k8s.io/cluster-api/util/contract.CalculateCRDName(Group, Kind).

Note: Cluster API is using such a naming convention to avoid an expensive CRD lookup operation when looking for labels from the CRD definition of the InfraCluster resource.

It is a generally applied convention to use names in the format ${env}Cluster, where ${env} is a, possibly short, name for the environment in question. For example GCPCluster is an implementation for the Google Cloud Platform, and AWSCluster is one for Amazon Web Services.

// +kubebuilder:object:root=true
// +kubebuilder:resource:path=fooclusters,shortName=foocl,scope=Namespaced,categories=cluster-api
// +kubebuilder:storageversion
// +kubebuilder:subresource:status

// FooCluster is the Schema for fooclusters.
type FooCluster struct {
    metav1.TypeMeta `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`
    Spec FooClusterSpec `json:"spec,omitempty"`
    Status FooClusterStatus `json:"status,omitempty"`
}

type FooClusterSpec struct {
    // See other rules for more details about mandatory/optional fields in InfraCluster spec.
    // Other fields SHOULD be added based on the needs of your provider.
}

type FooClusterStatus struct {
    // See other rules for more details about mandatory/optional fields in InfraCluster status.
    // Other fields SHOULD be added based on the needs of your provider.
}

For each InfraCluster resource, you MUST also add the corresponding list resource. The list resource MUST be named as <InfraCluster>List.

// +kubebuilder:object:root=true

// FooClusterList contains a list of fooclusters.
type FooClusterList struct {
    metav1.TypeMeta `json:",inline"`
    metav1.ListMeta `json:"metadata,omitempty"`
    Items           []FooCluster `json:"items"`
}

InfraCluster: control plane endpoint

Each Cluster needs a control plane endpoint to sit in front of control plane machines. Control plane endpoint can be provided in three ways in Cluster API: by the users, by the control plane provider or by the infrastructure provider.

In case you are developing an infrastructure provider which is responsible to provide a control plane endpoint for each Cluster, the host and port of the generated control plane endpoint MUST surface on spec.controlPlaneEndpoint in the InfraCluster resource.

type FooClusterSpec struct {
    // ControlPlaneEndpoint represents the endpoint used to communicate with the control plane.
    // +optional
    ControlPlaneEndpoint APIEndpoint `json:"controlPlaneEndpoint"`
    
    // See other rules for more details about mandatory/optional fields in InfraCluster spec.
    // Other fields SHOULD be added based on the needs of your provider.
}

// APIEndpoint represents a reachable Kubernetes API endpoint.
type APIEndpoint struct {
    // The hostname on which the API server is serving.
    Host string `json:"host"`
    
    // The port on which the API server is serving.
    Port int32 `json:"port"`
}

Once spec.controlPlaneEndpoint is set on the InfraCluster resource and the [InfraCluster initialization completed], the Cluster controller will surface this info in Cluster’s spec.controlPlaneEndpoint.

If instead you are developing an infrastructure provider which is NOT responsible to provide a control plane endpoint, the implementer should exit reconciliation until it sees Cluster’s spec.controlPlaneEndpoint populated.

InfraCluster: failure domains

In case you are developing an infrastructure provider which has a notion of failure domains where machines should be placed in, the list of available failure domains MUST surface on status.failureDomains in the InfraCluster resource.

type FooClusterStatus struct {
    // FailureDomains is a list of failure domain objects synced from the infrastructure provider.
    FailureDomains clusterv1.FailureDomains `json:"failureDomains,omitempty"`
    
    // See other rules for more details about mandatory/optional fields in InfraCluster status.
    // Other fields SHOULD be added based on the needs of your provider.
}

clusterv1.FailureDomains is a map, defined as map[string]FailureDomainSpec. A unique key must be used for each FailureDomainSpec. FailureDomainSpec is defined as:

  • controlPlane bool: indicates if failure domain is appropriate for running control plane instances.
  • attributes map[string]string: arbitrary attributes for users to apply to a failure domain.

Once status.failureDomains is set on the InfraCluster resource and the [InfraCluster initialization completed], the Cluster controller will surface this info in Cluster’s status.failureDomains.

InfraCluster: initialization completed

Each InfraCluster MUST report when Cluster’s infrastructure is fully provisioned (initialization) by setting status.ready in the InfraCluster resource.

type FooClusterStatus struct {
    // Ready denotes that the foo cluster infrastructure is fully provisioned.
    // +optional
    Ready bool `json:"ready"`
    
    // See other rules for more details about mandatory/optional fields in InfraCluster status.
    // Other fields SHOULD be added based on the needs of your provider.
}

Once status.ready the Cluster “core” controller will bubbles up this info in Cluster’s status.infrastructureReady; If defined, also InfraCluster’s spec.controlPlaneEndpoint and status.failureDomains will be surfaced on Cluster’s corresponding fields at the same time.

InfraCluster: conditions

According to Kubernetes API Conventions, Conditions provide a standard mechanism for higher-level status reporting from a controller.

Providers implementers SHOULD implement status.conditions for their InfraCluster resource. In case conditions are implemented, Cluster API condition type MUST be used.

If a condition with type Ready exist, such condition will be mirrored in Cluster’s InfrastructureReady condition.

Please note that the Ready condition is expected to surface the status of the InfraCluster during its own entire lifecycle, including initial provisioning, the final deletion process, and the period in between these two moments.

See Cluster API condition proposal for more context.

InfraCluster: pausing

Providers SHOULD implement the pause behaviour for every object with a reconciliation loop. This is done by checking if spec.paused is set on the Cluster object and by checking for the cluster.x-k8s.io/paused annotation on the InfraCluster object.

If implementing the pause behavior, providers SHOULD surface the paused status of an object using the Paused condition: Status.Conditions[Paused].

InfraCluster: terminal failures

Each InfraCluster SHOULD report when Cluster’s enter in a state that cannot be recovered (terminal failure) by setting status.failureReason and status.failureMessage in the InfraCluster resource.

type FooClusterStatus struct {
    // FailureReason will be set in the event that there is a terminal problem reconciling the FooCluster 
    // and will contain a succinct value suitable for machine interpretation.
    //
    // This field should not be set for transitive errors that can be fixed automatically or with manual intervention,
    // but instead indicate that something is fundamentally wrong with the FooCluster and that it cannot be recovered.
    // +optional
    FailureReason *capierrors.ClusterStatusError `json:"failureReason,omitempty"`
    
    // FailureMessage will be set in the event that there is a terminal problem reconciling the FooCluster
    // and will contain a more verbose string suitable for logging and human consumption.
    //
    // This field should not be set for transitive errors that can be fixed automatically or with manual intervention,
    // but instead indicate that something is fundamentally wrong with the FooCluster and that it cannot be recovered.
    // +optional
    FailureMessage *string `json:"failureMessage,omitempty"`
    
    // See other rules for more details about mandatory/optional fields in InfraCluster status.
    // Other fields SHOULD be added based on the needs of your provider.
}

Once status.failureReason and status.failureMessage are set on the InfraCluster resource, the Cluster “core” controller will surface those info in the corresponding fields in Cluster’s status.

Please note that once failureReason/failureMessage is set in Cluster’s status, the only way to recover is to delete and recreate the Cluster (it is a terminal failure).

InfraClusterTemplate, InfraClusterTemplateList resource definition

For a given InfraCluster resource, you should also add a corresponding InfraClusterTemplate resources in order to use it in ClusterClasses. The template resource MUST be named as <InfraCluster>Template.

// +kubebuilder:object:root=true
// +kubebuilder:resource:path=fooclustertemplates,scope=Namespaced,categories=cluster-api
// +kubebuilder:storageversion

// FooClusterTemplate is the Schema for the fooclustertemplates API.
type FooClusterTemplate struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`

    Spec FooClusterTemplateSpec `json:"spec,omitempty"`
}

type FooClusterTemplateSpec struct {
    Template FooClusterTemplateResource `json:"template"`
}

type FooClusterTemplateResource struct {
    // Standard object's metadata.
    // More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#metadata
    // +optional
    ObjectMeta clusterv1.ObjectMeta `json:"metadata,omitempty"`
    Spec FooClusterSpec `json:"spec"`
}

NOTE: in this example InfraClusterTemplate’s spec.template.spec embeds FooClusterSpec from InfraCluster. This might not always be the best choice depending of if/how InfraCluster’s spec fields applies to many clusters vs only one.

For each InfraClusterTemplate resource, you MUST also add the corresponding list resource. The list resource MUST be named as <InfraClusterTemplate>List.

// +kubebuilder:object:root=true

// FooClusterTemplateList contains a list of FooClusterTemplates.
type FooClusterTemplateList struct {
    metav1.TypeMeta `json:",inline"`
    metav1.ListMeta `json:"metadata,omitempty"`
    Items           []FooClusterTemplate `json:"items"`
}

Externally managed infrastructure

In some cases, users might be required (or choose to) manage infrastructure out of band and run CAPI on top of already existing infrastructure.

In order to support this use case, the InfraCluster controller SHOULD skip reconciliation of InfraCluster resources with the cluster.x-k8s.io/managed-by: "<name-of-system>" label, and not update the resource or its status in any way.

Please note that when the cluster infrastructure is externally managed, it is responsibility of external management system to abide to the following contract rules:

  • [InfraCluster control plane endpoint]
  • [InfraCluster failure domains]
  • [InfraCluster initialization completed]
  • [InfraCluster terminal failures]

See the externally managed infrastructure proposal for more detail about this use case.

Multi tenancy

Multi tenancy in Cluster API defines the capability of an infrastructure provider to manage different credentials, each one of them corresponding to an infrastructure tenant.

See infrastructure Provider Security Guidance for considerations about cloud provider credential management.

Please also note that Cluster API does not support running multiples instances of the same provider, which someone can assume an alternative solution to implement multi tenancy; same applies to the clusterctl CLI.

See Support running multiple instances of the same provider for more context.

However, if you want to make it possible for users to run multiples instances of your provider, your controller’s SHOULD:

  • support the --namespace flag.
  • support the --watch-filter flag.

Please, read carefully the page linked above to fully understand implications and risks related to this option.

Clusterctl support

The clusterctl command is designed to work with all the providers compliant with the rules defined in the clusterctl provider contract.

Typical InfraCluster reconciliation workflow

A cluster infrastructure provider must respond to changes to its InfraCluster resources. This process is typically called reconciliation. The provider must watch for new, updated, and deleted resources and respond accordingly.

As a reference you can look at the following workflow to understand how the typical reconciliation workflow is implemented in InfraCluster controllers:

Cluster infrastructure provider activity diagram

Normal resource

  1. If the resource is externally managed, exit the reconciliation
    1. The ResourceIsNotExternallyManaged predicate can be used to prevent reconciling externally managed resources
  2. If the resource does not have a Cluster owner, exit the reconciliation
    1. The Cluster API Cluster reconciler populates this based on the value in the Cluster‘s spec.infrastructureRef field.
  3. Add the provider-specific finalizer, if needed
  4. Reconcile provider-specific cluster infrastructure
    1. If any errors are encountered, exit the reconciliation
  5. If the provider created a load balancer for the control plane, record its hostname or IP in spec.controlPlaneEndpoint
  6. Set status.ready to true
  7. Set status.failureDomains based on available provider failure domains (optional)
  8. Patch the resource to persist changes

Deleted resource

  1. If the resource has a Cluster owner
    1. Perform deletion of provider-specific cluster infrastructure
    2. If any errors are encountered, exit the reconciliation
  2. Remove the provider-specific finalizer from the resource
  3. Patch the resource to persist changes