Contract rules for InfraMachine

Infrastructure providers SHOULD implement an InfraMachine resource.

The goal of an InfraMachine resource is to manage the lifecycle of provider-specific machine instances. These may be physical or virtual instances, and they represent the infrastructure for Kubernetes nodes.

The InfraMachine resource will be referenced by one of the Cluster API core resources, Machine.

The Machine’s controller is responsible for coordinating operations of the InfraMachine, and the interaction between the Machine’s controller and the InfraMachine resource is based on the contract rules defined in this page.

Once contract rules are satisfied by an InfraMachine implementation, other implementation details can be addressed according to specific needs (Cluster API is not prescriptive).

Nevertheless, it is always recommended to take a look at Cluster API controllers, in-tree providers, other providers and use them as a reference implementation (unless custom solutions are required in order to address very specific needs).

In order to facilitate the initial design for each InfraMachine resource, a few implementation best practices and infrastructure Provider Security Guidance are explicitly called out in dedicated pages.

Rules (contract version v1beta1)

| Rule | Mandatory | Note |
|------|-----------|------|
| All resources: scope | Yes | |
| All resources: TypeMeta and ObjectMeta field | Yes | |
| All resources: APIVersion field value | Yes | |
| InfraMachine, InfraMachineList resource definition | Yes | |
| InfraMachine: provider ID | Yes | |
| InfraMachine: failure domain | No | |
| InfraMachine: addresses | No | |
| InfraMachine: initialization completed | Yes | |
| InfraMachine: conditions | No | |
| InfraMachine: terminal failures | No | |
| InfraMachineTemplate, InfraMachineTemplateList resource definition | Yes | |
| InfraMachineTemplate: support for SSA dry run | No | Mandatory for ClusterClasses support |
| Multi tenancy | No | Mandatory for clusterctl CLI support |
| Clusterctl support | No | Mandatory for clusterctl CLI support |
| InfraMachine: pausing | No | |

Note:

  • All resources refers to all the provider’s resources that “core” Cluster API interacts with; in the context of this page: InfraMachine, InfraMachineTemplate, and the corresponding list types

All resources: scope

All resources MUST be namespace-scoped.

All resources: TypeMeta and ObjectMeta field

All resources MUST have the standard Kubernetes TypeMeta and ObjectMeta fields.

All resources: APIVersion field value

In Kubernetes, APIVersion is a combination of API group and version. Special consideration MUST apply to both the API group and the version for all the resources Cluster API interacts with.

All resources: API group

The domain for Cluster API resources is cluster.x-k8s.io, and infrastructure providers under the Kubernetes SIGS org generally use infrastructure.cluster.x-k8s.io as API group.

If your provider uses a different API group, you MUST grant full read/write RBAC permissions for resources in your API group to the Cluster API core controllers. The canonical way to do so is via a ClusterRole resource with the aggregation label cluster.x-k8s.io/aggregate-to-manager: "true".

The following is an example ClusterRole for a FooMachine resource in the infrastructure.foo.com API group:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: capi-foo-clusters
  labels:
    cluster.x-k8s.io/aggregate-to-manager: "true"
rules:
- apiGroups:
    - infrastructure.foo.com
  resources:
    - foomachines
    - foomachinetemplates
  verbs:
    - create
    - delete
    - get
    - list
    - patch
    - update
    - watch

Note: The write permissions are required because Cluster API manages InfraMachines generated from InfraMachineTemplates; when using ClusterClass and managed topologies, InfraMachineTemplates are also managed directly by Cluster API.

All resources: version

The resource Version defines the stability of the API and its backward compatibility guarantees. Examples include v1alpha1, v1beta1, v1, etc.; versions are governed by the Kubernetes API Deprecation Policy.

Your provider SHOULD abide by the same policies.

Note: The version of your provider does not need to be in sync with the version of core Cluster API resources. Instead, prefer choosing a version that matches the stability of the provider API and its backward compatibility guarantees.

Additionally:

Providers MUST set cluster.x-k8s.io/<version> label on the InfraMachine Custom Resource Definitions.

The label is a map from a Cluster API contract version to your Custom Resource Definition versions. The value is an underscore-delimited (_) list of versions. Each value MUST point to an available version in your CRD Spec.

The label allows Cluster API controllers to perform automatic conversions for object references; the controllers will pick the last available version in the list if multiple versions are found.

To apply the label to CRDs it’s possible to use commonLabels in your kustomization.yaml file, usually in config/crd:

commonLabels:
  cluster.x-k8s.io/v1alpha2: v1alpha1
  cluster.x-k8s.io/v1alpha3: v1alpha2
  cluster.x-k8s.io/v1beta1: v1beta1

An example of this is in the Kubeadm Bootstrap provider.

InfraMachine, InfraMachineList resource definition

You MUST define an InfraMachine resource. The InfraMachine resource name must have the format produced by sigs.k8s.io/cluster-api/util/contract.CalculateCRDName(Group, Kind).

Note: Cluster API uses this naming convention to avoid an expensive CRD lookup operation when looking for labels from the CRD definition of the InfraMachine resource.

It is a generally applied convention to use names in the format ${env}Machine, where ${env} is a (possibly short) name for the environment in question. For example, GCPMachine is an implementation for Google Cloud Platform, and AWSMachine is one for Amazon Web Services.

// +kubebuilder:object:root=true
// +kubebuilder:resource:path=foomachines,shortName=foom,scope=Namespaced,categories=cluster-api
// +kubebuilder:storageversion
// +kubebuilder:subresource:status

// FooMachine is the Schema for foomachines.
type FooMachine struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`

    Spec   FooMachineSpec   `json:"spec,omitempty"`
    Status FooMachineStatus `json:"status,omitempty"`
}

type FooMachineSpec struct {
    // See other rules for more details about mandatory/optional fields in InfraMachine spec.
    // Other fields SHOULD be added based on the needs of your provider.
}

type FooMachineStatus struct {
    // See other rules for more details about mandatory/optional fields in InfraMachine status.
    // Other fields SHOULD be added based on the needs of your provider.
}

For each InfraMachine resource, you MUST also add the corresponding list resource. The list resource MUST be named <InfraMachine>List.

// +kubebuilder:object:root=true

// FooMachineList contains a list of foomachines.
type FooMachineList struct {
    metav1.TypeMeta `json:",inline"`
    metav1.ListMeta `json:"metadata,omitempty"`
    Items           []FooMachine `json:"items"`
}

InfraMachine: provider ID

Each Machine needs a provider ID to identify the Kubernetes Node that runs on the machine. The Node’s provider ID MUST surface in spec.providerID in the InfraMachine resource.

type FooMachineSpec struct {
    // ProviderID must match the provider ID as seen on the node object corresponding to this machine.
    // For Kubernetes Nodes running on the Foo provider, this value is set by the corresponding CPI component
    // and it has the format docker:////<vm-name>.
    // +optional
    ProviderID *string `json:"providerID,omitempty"`

    // See other rules for more details about mandatory/optional fields in InfraMachine spec.
    // Other fields SHOULD be added based on the needs of your provider.
}

Once spec.providerID is set on the InfraMachine resource and the InfraMachine initialization is completed, the Machine controller will surface this info in Machine’s spec.providerID.

InfraMachine: failure domain

In case you are developing an infrastructure provider which has a notion of failure domains in which machines should be placed, the InfraMachine resource MUST comply with the value in the spec.failureDomain field of the Machine (in other words, the InfraMachine MUST be placed in the failure domain specified at Machine level).

Please note that, in order to allow a transparent transition from when there was no failure domain support in Cluster API and the InfraMachine was authoritative with regard to failure domain placement (before CAPI v0.3.0), Cluster API still supports a deprecated reverse process for failure domain management.

In the deprecated reverse process, the failure domain where the machine should be placed is defined in the InfraMachine’s spec.failureDomain field; the value of this field is then surfaced on the corresponding field at Machine level.
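
For illustration, the following is a minimal sketch of how an InfraMachine reconciler might honor the failure domain chosen at Machine level; infrav1 stands for your provider’s API package, while defaultZone and createInstance are hypothetical placeholders for provider-specific logic.

func (r *FooMachineReconciler) reconcileNormal(ctx context.Context, machine *clusterv1.Machine, fooMachine *infrav1.FooMachine) error {
    // Fall back to a provider-specific default when the Machine does not request a failure domain.
    zone := r.defaultZone
    if machine.Spec.FailureDomain != nil {
        // The InfraMachine MUST be placed in the failure domain specified at Machine level.
        zone = *machine.Spec.FailureDomain
    }
    // createInstance is a placeholder for the provider-specific call creating the machine instance.
    return r.createInstance(ctx, fooMachine, zone)
}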

InfraMachine: addresses

Infrastructure providers have the opportunity to surface machine addresses on the InfraMachine resource; this information won’t be used by core Cluster API controllers, but it is really useful for operators troubleshooting issues on machines.

In case you want to surface machine’s addresses, you MUST surface them in status.addresses in the InfraMachine resource.

type FooMachineStatus struct {
    // Addresses contains the associated addresses for the machine.
    // +optional
    Addresses []clusterv1.MachineAddress `json:"addresses,omitempty"`

    // See other rules for more details about mandatory/optional fields in InfraMachine status.
    // Other fields SHOULD be added based on the needs of your provider.
}

Each MachineAddress must have a type; accepted types are Hostname, ExternalIP, InternalIP, ExternalDNS or InternalDNS.

Once status.addresses is set on the InfraMachine resource and the InfraMachine initialization is completed, the Machine controller will surface this info in Machine’s status.addresses.
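
For example, a hypothetical excerpt of a reconcile loop surfacing the addresses reported by the provider API (instance and its fields are placeholders for provider-specific data):

fooMachine.Status.Addresses = []clusterv1.MachineAddress{
    {Type: clusterv1.MachineHostName, Address: instance.Name},
    {Type: clusterv1.MachineInternalIP, Address: instance.PrivateIP},
    {Type: clusterv1.MachineExternalIP, Address: instance.PublicIP},
}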

InfraMachine: initialization completed

Each InfraMachine MUST report when the Machine’s infrastructure is fully provisioned (initialization) by setting status.ready in the InfraMachine resource.

type FooMachineStatus struct {
    // Ready denotes that the foo machine infrastructure is fully provisioned.
    // +optional
    Ready bool `json:"ready"`
    
    // See other rules for more details about mandatory/optional fields in InfraMachine status.
    // Other fields SHOULD be added based on the needs of your provider.
}

Once status.ready is set to true, the Machine “core” controller will bubble up this info in Machine’s status.infrastructureReady; InfraMachine’s spec.providerID and status.addresses will also be surfaced on Machine’s corresponding fields at the same time.

InfraMachine: conditions

According to Kubernetes API Conventions, Conditions provide a standard mechanism for higher-level status reporting from a controller.

Provider implementers SHOULD implement status.conditions for their InfraMachine resource. In case conditions are implemented, the Cluster API condition type MUST be used.

If a condition with type Ready exists, such condition will be mirrored in Machine’s InfrastructureReady condition.

Please note that the Ready condition is expected to surface the status of the InfraMachine during its entire lifecycle, including initial provisioning, the final deletion process, and the period in between these two moments.

See Cluster API condition proposal for more context.
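
As a sketch of what this can look like when using the condition utilities from sigs.k8s.io/cluster-api/util/conditions: the FooMachine types need a Conditions field plus Getter/Setter methods, and the reconciler can then mark the Ready condition (the reason and message shown in the comments are hypothetical examples).

type FooMachineStatus struct {
    // Conditions defines current service state of the FooMachine.
    // +optional
    Conditions clusterv1.Conditions `json:"conditions,omitempty"`

    // See other rules for more details about mandatory/optional fields in InfraMachine status.
}

// GetConditions and SetConditions make FooMachine compatible with the conditions util.
func (m *FooMachine) GetConditions() clusterv1.Conditions {
    return m.Status.Conditions
}

func (m *FooMachine) SetConditions(conditions clusterv1.Conditions) {
    m.Status.Conditions = conditions
}

// In the reconciler:
//   conditions.MarkTrue(fooMachine, clusterv1.ReadyCondition)
//   conditions.MarkFalse(fooMachine, clusterv1.ReadyCondition, "InstanceProvisioningFailed",
//       clusterv1.ConditionSeverityError, "instance %s failed to provision", instanceName)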

InfraMachine: terminal failures

Each InfraMachine SHOULD report when the Machine enters a state that cannot be recovered (terminal failure) by setting status.failureReason and status.failureMessage in the InfraMachine resource.

type FooMachineStatus struct {
    // FailureReason will be set in the event that there is a terminal problem reconciling the FooMachine
    // and will contain a succinct value suitable for machine interpretation.
    //
    // This field should not be set for transient errors that can be fixed automatically or with manual intervention,
    // but instead indicate that something is fundamentally wrong with the FooMachine and that it cannot be recovered.
    // +optional
    FailureReason *capierrors.MachineStatusError `json:"failureReason,omitempty"`

    // FailureMessage will be set in the event that there is a terminal problem reconciling the FooMachine
    // and will contain a more verbose string suitable for logging and human consumption.
    //
    // This field should not be set for transient errors that can be fixed automatically or with manual intervention,
    // but instead indicate that something is fundamentally wrong with the FooMachine and that it cannot be recovered.
    // +optional
    FailureMessage *string `json:"failureMessage,omitempty"`

    // See other rules for more details about mandatory/optional fields in InfraMachine status.
    // Other fields SHOULD be added based on the needs of your provider.
}

Once status.failureReason and status.failureMessage are set on the InfraMachine resource, the Machine “core” controller will surface this info in the corresponding fields in Machine’s status.

Please note that once failureReason/failureMessage is set in Machine’s status, the only way to recover is to delete and recreate the Machine (it is a terminal failure).

InfraMachine: pausing

Providers SHOULD implement the pause behavior for every object with a reconciliation loop. This is done by checking if spec.paused is set on the Cluster object and by checking for the cluster.x-k8s.io/paused annotation on the InfraMachine object.

If implementing the pause behavior, providers SHOULD surface the paused status of an object using the Paused condition: Status.Conditions[Paused].
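
A minimal sketch of the pause check at the top of the reconcile loop, using IsPaused from sigs.k8s.io/cluster-api/util/annotations (which covers both the Cluster’s spec.paused field and the cluster.x-k8s.io/paused annotation); it assumes the owning Cluster and the FooMachine have already been retrieved.

if annotations.IsPaused(cluster, fooMachine) {
    log.Info("FooMachine or linked Cluster is marked as paused, not reconciling")
    return ctrl.Result{}, nil
}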

InfraMachineTemplate, InfraMachineTemplateList resource definition

For a given InfraMachine resource, you MUST also add a corresponding InfraMachineTemplate resource in order to use it when defining sets of machines, e.g. in MachineDeployments.

The template resource MUST be named <InfraMachine>Template.

// +kubebuilder:object:root=true
// +kubebuilder:resource:path=foomachinetemplates,scope=Namespaced,categories=cluster-api
// +kubebuilder:storageversion

// FooMachineTemplate is the Schema for the foomachinetemplates API.
type FooMachineTemplate struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`

    Spec FooMachineTemplateSpec `json:"spec,omitempty"`
}

type FooMachineTemplateSpec struct {
    Template FooMachineTemplateResource `json:"template"`
}

type FooMachineTemplateResource struct {
    // Standard object's metadata.
    // More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#metadata
    // +optional
    ObjectMeta clusterv1.ObjectMeta `json:"metadata,omitempty"`
    Spec FooMachineSpec `json:"spec"`
}

NOTE: in this example the InfraMachineTemplate’s spec.template.spec embeds FooMachineSpec from InfraMachine. This might not always be the best choice, depending on whether and how the InfraMachine’s spec fields apply to many machines vs only one.

For each InfraMachineTemplate resource, you MUST also add the corresponding list resource. The list resource MUST be named <InfraMachineTemplate>List.

// +kubebuilder:object:root=true

// FooMachineTemplateList contains a list of FooMachineTemplates.
type FooMachineTemplateList struct {
    metav1.TypeMeta `json:",inline"`
    metav1.ListMeta `json:"metadata,omitempty"`
    Items           []FooMachineTemplate `json:"items"`
}

InfraMachineTemplate: support for SSA dry run

When Cluster API’s topology controller is trying to identify differences between templates defined in a ClusterClass and the current Cluster topology, it is required to run a Server Side Apply (SSA) dry run call.

However, in case you implement immutability checks for your InfraMachineTemplate, this can lead the SSA dry run call to errors.

In order to avoid this, InfraMachineTemplate MUST specifically implement support for SSA dry run calls from the topology controller.

The implementation requires the use of controller-runtime’s CustomValidator, available in controller-runtime versions >= v0.12.3.

This allows skipping the immutability check only when the topology controller is dry running, while preserving the validation behavior for all other cases.

See the DockerMachineTemplate webhook as a reference for a compatible implementation.
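
The following sketch follows the same approach as the DockerMachineTemplate webhook, assuming the ShouldSkipImmutabilityChecks helper from sigs.k8s.io/cluster-api/util/topology and the CustomValidator signatures of recent controller-runtime versions; FooMachineTemplateWebhook and the error messages are illustrative.

// ValidateUpdate implements webhook.CustomValidator.
func (w *FooMachineTemplateWebhook) ValidateUpdate(ctx context.Context, oldRaw, newRaw runtime.Object) (admission.Warnings, error) {
    newObj, ok := newRaw.(*FooMachineTemplate)
    if !ok {
        return nil, apierrors.NewBadRequest(fmt.Sprintf("expected a FooMachineTemplate but got a %T", newRaw))
    }
    oldObj, ok := oldRaw.(*FooMachineTemplate)
    if !ok {
        return nil, apierrors.NewBadRequest(fmt.Sprintf("expected a FooMachineTemplate but got a %T", oldRaw))
    }

    // Retrieve the admission request to detect SSA dry run calls from the topology controller.
    req, err := admission.RequestFromContext(ctx)
    if err != nil {
        return nil, apierrors.NewBadRequest(fmt.Sprintf("expected an admission.Request inside context: %v", err))
    }

    // Enforce immutability of spec.template.spec, but skip the check for dry run calls
    // issued by the topology controller.
    if !topology.ShouldSkipImmutabilityChecks(req, newObj) &&
        !reflect.DeepEqual(newObj.Spec.Template.Spec, oldObj.Spec.Template.Spec) {
        return nil, apierrors.NewBadRequest("FooMachineTemplate spec.template.spec field is immutable")
    }
    return nil, nil
}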

Multi tenancy

Multi tenancy in Cluster API defines the capability of an infrastructure provider to manage different credentials, each one of them corresponding to an infrastructure tenant.

See infrastructure Provider Security Guidance for considerations about cloud provider credential management.

Please also note that Cluster API does not support running multiple instances of the same provider, which someone might assume to be an alternative solution for implementing multi tenancy; the same applies to the clusterctl CLI.

See Support running multiple instances of the same provider for more context.

However, if you want to make it possible for users to run multiple instances of your provider, your controllers SHOULD:

  • support the --namespace flag.
  • support the --watch-filter flag.

Please, read carefully the page linked above to fully understand implications and risks related to this option.

Clusterctl support

The clusterctl command is designed to work with all the providers compliant with the rules defined in the clusterctl provider contract.

Typical InfraMachine reconciliation workflow

A machine infrastructure provider must respond to changes to its InfraMachine resources. This process is typically called reconciliation. The provider must watch for new, updated, and deleted resources and respond accordingly.

As a reference you can look at the following workflow to understand how the typical reconciliation workflow is implemented in InfraMachine controllers; a condensed code sketch follows each of the two step lists below:

Machine infrastructure provider activity diagram

Normal resource

  1. If the resource does not have a Machine owner, exit the reconciliation
    1. The Cluster API Machine reconciler populates this based on the value in the Machine’s spec.infrastructureRef field
  2. If the resource has status.failureReason or status.failureMessage set, exit the reconciliation
  3. If the Cluster to which this resource belongs cannot be found, exit the reconciliation
  4. Add the provider-specific finalizer, if needed
  5. If the associated Cluster’s status.infrastructureReady is false, exit the reconciliation
    1. Note: This check should not be blocking any further delete reconciliation flows.
    2. Note: This check should only be performed after appropriate owner references (if any) are updated.
  6. If the associated Machine’s spec.bootstrap.dataSecretName is nil, exit the reconciliation
  7. Reconcile provider-specific machine infrastructure
    1. If any errors are encountered:
      1. If they are terminal failures, set status.failureReason and status.failureMessage
      2. Exit the reconciliation
    2. If this is a control plane machine, register the instance with the provider’s control plane load balancer (optional)
  8. Set spec.providerID to the provider-specific identifier for the provider’s machine instance
  9. Set status.ready to true
  10. Set status.addresses to the provider-specific set of instance addresses (optional)
  11. Set spec.failureDomain to the provider-specific failure domain the instance is running in (optional)
  12. Patch the resource to persist changes
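
For reference, the following is a condensed sketch of the normal-resource steps above, using helpers from sigs.k8s.io/cluster-api/util (GetOwnerMachine, GetClusterFromMetadata, patch.Helper); the deletion branch is omitted, and infrav1.MachineFinalizer and reconcileInstance are hypothetical placeholders for provider-specific code.

func (r *FooMachineReconciler) Reconcile(ctx context.Context, req ctrl.Request) (_ ctrl.Result, reterr error) {
    fooMachine := &infrav1.FooMachine{}
    if err := r.Client.Get(ctx, req.NamespacedName, fooMachine); err != nil {
        return ctrl.Result{}, client.IgnoreNotFound(err)
    }

    // 1. Exit if there is no Machine owner yet; the Machine controller sets the owner
    // reference based on Machine's spec.infrastructureRef.
    machine, err := util.GetOwnerMachine(ctx, r.Client, fooMachine.ObjectMeta)
    if err != nil || machine == nil {
        return ctrl.Result{}, err
    }

    // 2. Exit if a terminal failure has already been reported.
    if fooMachine.Status.FailureReason != nil || fooMachine.Status.FailureMessage != nil {
        return ctrl.Result{}, nil
    }

    // 3. Exit if the owning Cluster cannot be found.
    cluster, err := util.GetClusterFromMetadata(ctx, r.Client, machine.ObjectMeta)
    if err != nil {
        return ctrl.Result{}, nil
    }

    // 12. Persist any change made below when the function returns.
    patchHelper, err := patch.NewHelper(fooMachine, r.Client)
    if err != nil {
        return ctrl.Result{}, err
    }
    defer func() {
        if err := patchHelper.Patch(ctx, fooMachine); err != nil && reterr == nil {
            reterr = err
        }
    }()

    // 4. Add the provider-specific finalizer, if needed.
    controllerutil.AddFinalizer(fooMachine, infrav1.MachineFinalizer)

    // 5. + 6. Wait for the cluster infrastructure to be ready and for the bootstrap
    // data secret to be created by the bootstrap provider.
    if !cluster.Status.InfrastructureReady || machine.Spec.Bootstrap.DataSecretName == nil {
        return ctrl.Result{}, nil
    }

    // 7.-11. Provision the provider-specific machine infrastructure, then set
    // spec.providerID, status.ready, status.addresses and (optionally) spec.failureDomain.
    return ctrl.Result{}, r.reconcileInstance(ctx, cluster, machine, fooMachine)
}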

Deleted resource

  1. If the resource has a Machine owner
    1. Perform deletion of provider-specific machine infrastructure
    2. If this is a control plane machine, deregister the instance from the provider’s control plane load balancer (optional)
    3. If any errors are encountered, exit the reconciliation
  2. Remove the provider-specific finalizer from the resource
  3. Patch the resource to persist changes
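
A matching sketch of the deletion flow; deleteInstance and infrav1.MachineFinalizer are hypothetical placeholders, and persisting the change is expected to happen via a patch helper, as in the normal flow.

func (r *FooMachineReconciler) reconcileDelete(ctx context.Context, machine *clusterv1.Machine, fooMachine *infrav1.FooMachine) (ctrl.Result, error) {
    if machine != nil {
        // 1. Delete the provider-specific machine infrastructure (and, for control plane
        // machines, deregister the instance from the load balancer, if applicable).
        if err := r.deleteInstance(ctx, fooMachine); err != nil {
            return ctrl.Result{}, err
        }
    }

    // 2. Remove the provider-specific finalizer so the object can be garbage collected.
    controllerutil.RemoveFinalizer(fooMachine, infrav1.MachineFinalizer)
    return ctrl.Result{}, nil
}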
