Comments (13)

kfox1111 avatar kfox1111 commented on August 19, 2024

Having had some discussions with folks on slack, I think there are some configurations where it may be safe to turn off TOFU with the existing plugin. Maybe we need a new config option for the plugin to make it possible to disable?

from spire.

dfeldman avatar dfeldman commented on August 19, 2024

Ok, so the easy way to do this is just to ignore the reattestation error for aws_iid when a config option is set. That may be the way to go for dev and test. But in the longer term it should be possible to do secure, reattestable attestation on AWS.

Here is what I have been thinking. I've been bouncing this idea off various people but have nothing implemented yet.

  • New node attestor called "aws_tpmx". Tpmx stands for "TPM with extended metadata," to distinguish it from other node attestors that use TPM alone.
  • First, on the agent side, it would determine the instance ID. This would come from IMDS by default, but you could also override it with a configuration option (if IMDS is unavailable). This instance ID is not trusted by the server.
  • On the server side, it would query the AWS API to determine the EK of the agent's instance id.
  • The server would generate a nonce and ask the agent to sign the nonce with that EK.
  • The server would verify the nonce signature. This proves that the instance ID sent by the agent earlier is correct.
  • The server would set selectors on the node attestor for instance ID, tags, roles, etc. all determined from the AWS API using the instance ID.
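
The challenge-response steps above can be sketched as follows. This is a minimal illustration, not an implementation of the proposed plugin: HMAC with a per-instance secret stands in for the TPM-backed key (a real attestor would verify an asymmetric proof against the EK public key returned by the cloud API), and `CLOUD_KEYS`, `Server`, and `agent_respond` are hypothetical names.

```python
import hashlib
import hmac
import secrets

# ASSUMPTION: stand-in for the cloud API lookup "instance ID -> key material".
CLOUD_KEYS = {"i-0abc": b"key-for-i-0abc"}


class Server:
    def __init__(self, cloud_keys):
        self._cloud_keys = cloud_keys
        self._pending = {}  # claimed instance ID -> outstanding nonce

    def challenge(self, claimed_instance_id):
        # The claimed ID is untrusted; it is only used to pick the key.
        if claimed_instance_id not in self._cloud_keys:
            raise KeyError("unknown instance: " + claimed_instance_id)
        nonce = secrets.token_bytes(32)
        self._pending[claimed_instance_id] = nonce
        return nonce

    def verify(self, claimed_instance_id, response):
        # pop() makes each nonce single-use, so a captured response
        # cannot be replayed later.
        nonce = self._pending.pop(claimed_instance_id, None)
        if nonce is None:
            return False
        key = self._cloud_keys[claimed_instance_id]
        expected = hmac.new(key, nonce, hashlib.sha256).digest()
        return hmac.compare_digest(expected, response)


def agent_respond(key, nonce):
    # ASSUMPTION: in the real design this would happen inside the TPM.
    return hmac.new(key, nonce, hashlib.sha256).digest()
```

Only after `verify` succeeds would the server treat the claimed instance ID as proven and go on to derive selectors from the cloud API.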

The reason for this idea:

  • This allows blocking the IMDS entirely, which is useful for high security environments (and the default for some setups).
  • This exact same approach works on GCP and Azure, which have APIs to get the EK of any instance.
  • The agent is proving that it is the real instance that has the quoted instance ID, AND that it has access to /dev/tpm, which should be accessible only to privileged processes running on that instance.
  • The challenge-response with signing a server-generated nonce is reattestable since it cannot be replayed.

AWS PCR4 register unique behavior:

  • In AWS only, instances have a hash of the instance ARN in their TPM PCR4 register.
  • I do not think it is wise to rely on this feature though. First, it is AWS-specific behavior. Second, because it's only a hash of the ARN, it cannot be used as an index to look up any details of the instance, only to verify them.
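
The "verify but not look up" limitation can be illustrated with a short sketch. It assumes, purely for illustration, a plain SHA-256 digest of the ARN string; the actual PCR extend computation and encoding are not described in this thread and will differ. `arn_digest` and `matches_arn` are hypothetical helpers.

```python
import hashlib
import hmac


def arn_digest(arn: str) -> bytes:
    # ASSUMPTION: plain SHA-256 over the UTF-8 ARN, not the real PCR format.
    return hashlib.sha256(arn.encode("utf-8")).digest()


def matches_arn(candidate_arn: str, observed_digest: bytes) -> bool:
    # A digest can confirm a candidate ARN the server already has,
    # but it cannot be reversed to discover an unknown ARN.
    return hmac.compare_digest(arn_digest(candidate_arn), observed_digest)
```

The server therefore still needs the instance identity from somewhere else before the hash is of any use.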

kfox1111 avatar kfox1111 commented on August 19, 2024

I think this plan sounds basically like the boxboat tpm plugin, plus some extra selector work added? Maybe that plugin could be extended with the X functionality on top?

dfeldman avatar dfeldman commented on August 19, 2024

The Boxboat plugin is a good starting point, but I'm fairly sure it doesn't have the step where it uses the AWS API to fetch the EK. (That functionality in AWS is actually very new -- less than 1 month old).

kfox1111 avatar kfox1111 commented on August 19, 2024

Right. But that is probably an extension to what the plugin is already doing. For example, if you could take the Boxboat server plugin and extend the class with an extra set of calls at the end of attestation, after it has done the TPM part, to add the cloud attestation/selectors, that might simplify the implementation?

kfox1111 avatar kfox1111 commented on August 19, 2024

I guess what I'm saying is, maybe a lot of the Boxboat plugin could be used to implement the other plugins being discussed rather than starting a new one, if done carefully. Maybe a good reason to bring the Boxboat plugin in tree.

dfeldman avatar dfeldman commented on August 19, 2024

Makes sense. The code would be largely common between AWS, GCP, and Azure versions of this node attestor.
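
The layering discussed here (common TPM logic, with cloud-specific selectors added at the end of attestation) might look roughly like the sketch below. All class names, method names, and the selector strings are hypothetical; this does not reflect the Boxboat plugin's real API or SPIRE's plugin interfaces.

```python
from typing import Callable, Dict, List


class TpmAttestor:
    """Hypothetical base: runs the TPM step, then emits selectors."""

    def verify_tpm(self, payload: Dict[str, str]) -> str:
        # Placeholder for the real TPM verification. The payload is
        # accepted as-is only so the control flow can be shown.
        return payload["instance_id"]

    def extra_selectors(self, instance_id: str) -> List[str]:
        # Hook for cloud-specific subclasses; the base adds nothing.
        return []

    def attest(self, payload: Dict[str, str]) -> List[str]:
        instance_id = self.verify_tpm(payload)
        base = ["instance-id:" + instance_id]
        return base + self.extra_selectors(instance_id)


class CloudTpmAttestor(TpmAttestor):
    """Hypothetical AWS/GCP/Azure variant: only the lookup differs."""

    def __init__(self, describe_instance: Callable[[str], Dict[str, str]]):
        # ASSUMPTION: describe_instance wraps the provider's API call.
        self._describe_instance = describe_instance

    def extra_selectors(self, instance_id: str) -> List[str]:
        tags = self._describe_instance(instance_id)
        return ["tag:%s:%s" % (k, v) for k, v in sorted(tags.items())]
```

With this shape, a per-cloud variant only supplies its own `describe_instance` callable, and the TPM handling stays shared.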

kfox1111 avatar kfox1111 commented on August 19, 2024

I could even see it being used with other plugins as well. Maybe an OpenStack one, or a VMware one, as they get their own equivalents.

azdagron avatar azdagron commented on August 19, 2024

However, we noticed that whenever the spire-agent crashes or restarts (or the instance itself is restarted), there is no way to re-attest the agents

What KeyManager are your agents using? If the agent is able to persist its key, then it shouldn't need to re-attest on startup (assuming its SVID has not expired, which would happen if the agent was down for a long time).
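
The startup condition described here can be written down as a small decision function. This is a simplified model of the reasoning in the comment, not SPIRE's actual agent code; `needs_attestation` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def needs_attestation(
    has_persisted_key: bool,
    svid_not_after: Optional[datetime],
    now: datetime,
) -> bool:
    # Without a persisted key or a stored SVID there is nothing to resume.
    if not has_persisted_key or svid_not_after is None:
        return True
    # With a persisted key, attestation is only needed once the SVID
    # has expired (for example after a long outage).
    return now >= svid_not_after
```

For example, an agent with a disk-persisted key that restarts an hour before its SVID expires would resume without attesting again, while one that was down past expiry would not.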

kfox1111 avatar kfox1111 commented on August 19, 2024

I do like the idea of having cloud providers that have to periodically re-prove, via reattestation, that the VM is still where it claims to be, regardless of whether TOFU can be made to stick for this specific case.

ranjit-se7en avatar ranjit-se7en commented on August 19, 2024

However, we noticed that whenever the spire-agent crashes or restarts (or the instance itself is restarted), there is no way to re-attest the agents

What KeyManager are your agents using? If the agent is able to persist its key, then it shouldn't need to re-attest on startup (assuming its SVID has not expired, which would happen if the agent was down for a long time).

We are using the KeyManager disk plugin, and no, the agents don't re-attest while the SVID has not expired.

azdagron avatar azdagron commented on August 19, 2024

I think we'd be fairly hesitant to just add a configuration to the plugin that disabled TOFU. This configuration would be too easy to reach for when facing friction with agent eviction. Disabling TOFU should probably only be done with extreme scrutiny.

Are there situations where SPIRE itself could determine that it was safe to disable TOFU?

Having had some discussions with folks on slack, I think there are some configurations where it may be safe to turn off TOFU with the existing plugin. Maybe we need a new config option for the plugin to make it possible to disable?

@kfox1111, what configurations do you know about?

kfox1111 avatar kfox1111 commented on August 19, 2024

I'm thinking we might add a new option: a Unix-socket-based proxy to the IID that the spire-server can use.

In a Kubernetes environment, access to the IID is blocked for any pod that is not hostNetwork: true, so it is pretty safe. This could probably be verified by just trying to contact it directly from the spire-server.
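
A reachability check of the kind suggested here could be as simple as the sketch below. It only tests whether a TCP connection to the instance metadata address (169.254.169.254, port 80) can be opened; `can_reach` is a hypothetical helper, and a failed connection is a hint rather than proof that the endpoint is blocked for every pod.

```python
import socket

IMDS_HOST = "169.254.169.254"  # link-local instance metadata address
IMDS_PORT = 80


def can_reach(host: str = IMDS_HOST, port: int = IMDS_PORT,
              timeout: float = 1.0) -> bool:
    # True if a TCP connection opens within the timeout, False on
    # refusal, timeout, or any other socket-level error.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run from inside a non-hostNetwork pod, a `False` result would be consistent with the metadata endpoint being blocked as described.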

All the spire-agents are running hostNetwork: true so should be able to access iid. So, nothing special to do there.

We could run a DaemonSet with a small proxy that makes the IID available over a Unix socket to servers that are otherwise locked down, along with a CSI instance that exposes it to the spire-servers. The existing SPIFFE driver should work without modifications for that. Then the server does not have to run with hostNetwork: true, which would require a lot more permissions in Kubernetes.

With AWS gaining TPM support, the IID could be swapped out for TPMs on the agent side for better security still. I think the spire-server will still need IID access, though, to verify the TPMs.
