Policy for disabling SMT #181

@bgilbert

There have been multiple rounds of CPU vulnerabilities (L1TF and MDS) which cannot be completely mitigated without disabling Simultaneous Multi-Threading (SMT) on affected processors. Disabling SMT reduces system performance and changes the apparent number of processors on the system, which might confuse container orchestrators and the like, so it's difficult to do without warning in existing OSes -- especially on an upgrade of existing nodes. (Container Linux opted to leave SMT enabled but advise users to turn it off.) With Fedora CoreOS, we have the opportunity to choose a different default.

The kernel supports two relevant command-line arguments:

  • mitigations=auto,nosmt disables SMT when needed to mitigate a vulnerability.
  • nosmt unconditionally disables SMT.
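
To make the difference between the two arguments concrete, here is a minimal sketch that classifies a kernel command line by the SMT policy it requests. The function name `smt_policy` and the three labels it returns are made up for illustration. The parsing is deliberately simplified (whitespace splitting, no quoting) and is not how the kernel itself parses its arguments.

```python
def smt_policy(cmdline: str) -> str:
    """Classify the SMT policy requested by a kernel command line.

    Returns one of three illustrative labels (not kernel terminology):
      "always-off"        -> nosmt (or nosmt=...) is present
      "off-if-vulnerable" -> mitigations=...,nosmt is present
      "default"           -> neither is present
    """
    policy = "default"
    for arg in cmdline.split():
        # nosmt disables SMT unconditionally, so it wins over anything else.
        if arg == "nosmt" or arg.startswith("nosmt="):
            return "always-off"
        if arg.startswith("mitigations="):
            options = arg.split("=", 1)[1].split(",")
            # Assumption: if the argument is repeated, the last one applies.
            policy = "off-if-vulnerable" if "nosmt" in options else "default"
    return policy


if __name__ == "__main__":
    # /proc/cmdline holds the command line of the running kernel.
    with open("/proc/cmdline") as f:
        print(smt_policy(f.read()))
```

Run on a node, this reports which of the policies discussed here the current boot asked for. It says nothing about whether SMT actually ended up disabled.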

There are also ways to disable SMT from userspace, but they're not connected to the kernel's vulnerability-detection logic. The only way to disable SMT conditionally, based on that logic, is via the kernel argument (karg).
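
For reference, the userspace interface lives in sysfs: `/sys/devices/system/cpu/smt/control` reports (and, when written by root, changes) the SMT state, `/sys/devices/system/cpu/smt/active` reports whether SMT is currently in use, and `/sys/devices/system/cpu/vulnerabilities/` holds one status file per known vulnerability. The sketch below only reads these files. The helper name `smt_report` and its `sysfs_cpu` parameter are hypothetical, and the parameter exists only so the function can be pointed at a test directory.

```python
from pathlib import Path


def smt_report(sysfs_cpu="/sys/devices/system/cpu"):
    """Collect the kernel's SMT state and per-vulnerability status strings."""
    base = Path(sysfs_cpu)

    def read(path):
        # Files may be missing on older kernels or other architectures.
        try:
            return path.read_text().strip()
        except OSError:
            return None

    vulnerabilities = {}
    vuln_dir = base / "vulnerabilities"
    if vuln_dir.is_dir():
        for entry in sorted(vuln_dir.iterdir()):
            vulnerabilities[entry.name] = read(entry)

    return {
        "control": read(base / "smt" / "control"),  # e.g. on, off, forceoff
        "active": read(base / "smt" / "active"),    # "1" if SMT is in use
        "vulnerabilities": vulnerabilities,
    }


if __name__ == "__main__":
    report = smt_report()
    print("SMT control:", report["control"], "active:", report["active"])
    for name, status in report["vulnerabilities"].items():
        print(f"{name}: {status}")
```

A script like this can tell an operator whether a node is affected, but acting on the result from userspace would mean reimplementing the kernel's decision, which is the gap described above.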

Proposal

Set mitigations=auto,nosmt by default on Fedora CoreOS.

Good parts:

  • Better security by default
  • Does not incur an unnecessary performance hit on CPUs that don't need it

Bad parts:

  • Requires passing a karg, which isn't trivial to change via Ignition (though see ostreedev/ostree#479, "Support default kernel arguments")
  • Causes a performance hit on vulnerable systems running workloads that aren't affected by the vulnerability, such as trusted workloads
  • Does not protect systems with unknown SMT vulnerabilities, until those vulnerabilities become known and a kernel update is pushed out
  • Updates to address future vulnerabilities can substantially reduce the performance of a node, as well as its apparent number of processors. I think this is a reasonable policy for a new OS, but we'd certainly need to document it.

Regardless of what we choose, we should document the default behavior and the fact that the user can change it.

Alternative defaults

  • Unconditionally disable SMT. This provides predictable machine performance over the long term, but also penalizes processor families that have not had any public vulnerabilities.
  • Leave SMT enabled. This leaves users exposed to known vulnerabilities, which is unsatisfactory.
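
The three candidate defaults can be compared with a toy model. This is only a sketch of the tradeoff as described in this issue, not kernel behavior: `smt_enabled` and its arguments are hypothetical names, and "known vulnerable" means known to the kernel that is currently running.

```python
def smt_enabled(policy: str, cpu_known_vulnerable: bool) -> bool:
    """Toy model: does SMT stay on under a given default policy?"""
    if policy == "nosmt":
        return False                      # always off
    if policy == "mitigations=auto,nosmt":
        return not cpu_known_vulnerable   # off only when the kernel knows
    if policy == "default":
        return True                       # always on
    raise ValueError(f"unknown policy: {policy}")


if __name__ == "__main__":
    for policy in ("mitigations=auto,nosmt", "nosmt", "default"):
        for vulnerable in (False, True):
            state = "on" if smt_enabled(policy, vulnerable) else "off"
            print(f"{policy:24} known_vulnerable={vulnerable!s:5} SMT {state}")
```

Under the proposed default, a CPU's row can flip from "on" to "off" when a kernel update teaches the kernel about a new vulnerability. That is the behavior change we'd need to document.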
