-
Notifications
You must be signed in to change notification settings - Fork 389
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MSC4256: RFC 9420 MLS mode Matrix #4256
base: main
Are you sure you want to change the base?
Conversation
|
||
![Invite flow][invite-flow] | ||
|
||
## Potential issues |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about display name and avatar url of members? Are they stored in the MLS group context too? And if so, how to prevent malicious users from changing them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, there will be a section on this here.
7. If the `powers` has entries not in `servers`, reject | ||
8. If the `powers` start not with exactly the same entries in the same order as the subset of | ||
entries in the previous `powers` above the `origin` , reject | ||
9. If the `can_propose` has entries not in `servers`, reject |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would we need to enforce that entries from powers
also appear in can_propose
to enable the process described in "Commit handling with multiple servers" below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically that is optional, since events can either be sent as a proposed commit or directly. Ideally every server can propose though, although the first power server really doesn't need to.
Currently redactions are not supported in the outer protocol layer. This has the benefit of making | ||
redactions invisible to the server, but also prevents any server side aggregation. This also means | ||
content can’t be deleted. As we require all events to be encrypted, this might not be a major | ||
problem, but policy makers might disagree. Redactions could be supported in various ways and we |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Redactions are not only relevant in the context of data privacy but also for moderation. Not being able to delete malicious or illegal content seems like a significant problem. 😕
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've only read through part of this, but here are some (mostly nitpicky) comments so far. Generally, there's inconsistency between using "mls" and "MLS". It would be nice to use "MLS" everywhere.
encrypted events with only an `algorithm` and `ciphertext` field. It contains the current MLS | ||
commit as a MLS private message. | ||
|
||
The MLS commit contains a CBOR encoded field in its unencrypted authenticated data of the MLS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you give some rationale here for why you're using CBOR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, is this CBOR-encoded data the only thing in the authenticated data, or are there other things that are in there alongside it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JSON would have some overhead when putting it directly into the bytes here, so CBOR seemed like the more natural fit, but it isn't really a requirement. We can change that.
Currently there are no other fields in the authenticated data. There is a way to structure extensions using authenticated data, that we aren't using here yet, which could be more appropriate? But so far we decided to go with the simplest approach.
4. Considering the events auth events: | ||
1. If there is not exactly one `m.room.create` event (with implicitly an empty state key), reject | ||
2. If the event is not the `m.room.create` event (we never get here) or an `m.mls.commit` event | ||
with epoch == 0 (we do get here), then if there is no `m.mls.commit` event in the auth events, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see anywhere in this MSC that mentions where the epoch comes from. I think it may be helpful to mention something about that somewhere, since this MSC will be read by people not completely familiar with MLS. (It comes from the ciphertext
, which despite being called "ciphertext", contains some cleartext bits, including the epoch. Which means that servers will be expected to be able to parse the MLS ciphertext, at least enough to read the epoch, group ID, and anything else that's needed.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, exactly what they said and we should put that somewhere
6. If the `powers` has entries not in `servers`, reject | ||
7. If the `can_propose` has entries not in `servers`, reject | ||
8. Otherwise, allow | ||
11. If the epoch of the `ciphertext` is not the epoch of the current `m.mls.commit` event + 1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't the epoch of the ciphertext be equal to the epoch of the current m.mls.commit
event (which I assume means the m.mls.commit
event that is referenced in the event's auth_events)? Otherwise you're encrypting with an epoch that hasn't been created yet?
|
||
State resolution might result in some clients losing access to messages as they threw their private | ||
state away immediately to provide PCS and forward secrecy. In that case the clients dropped by the | ||
state event will have to be sent another welcome to rejoin them into the room. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way for clients to ask to be re-added? How do others know that the client needs to be re-added?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A client can be re-added by reinviting them. Usually there is no way to know this for certain, but clients can likely guess a client got dropped from the room by looking at the results of state resolution. But that is unreliably to do over federation, as you can't always know what fork a room ended up on on that server.
Having conflicting commits will lead to some undecryptable messages. For short durations that can be | ||
mitigated by keeping the old state around for seconds to minutes. This is the common case when | ||
people invite new users at the same time or do otherwise racing modifications on different servers. | ||
For long durations no automatic recovery is possible and manual communication is necessary. We |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain what this "manual communication" would entail? Are users going to have to talk to each other out-of-band and poke at the cryptographic state in their clients?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the users must exchange information out of band. This can be by email, face to face or another intact or private room or similar.
|
||
As no member events are used in this room version, the direct /state endpoint can not be used to | ||
change membership in a room | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How are user's new devices/deleted devices handled? Who is in charge of adding/removing them from the group? How do people leave the group?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Users can add their own devices as part of the verification process during login or other users might propose adding them, when they notice the new device and can verify it.
For leaving users can propose a self removal (if that extension is enabled) or ask someone else to remove them from the group via normal communication in the room.
|
||
### Room state | ||
|
||
Currently all state is stored in one GroupContext extension. This has obvious size limitations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be clear, I believe that this means that having the state in the GroupContext means that the entire state will be encoded in every single commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but that's a known issue and can be resolved with MLS extensions. Either a custom extension that only sends a diff, or by using mechanisms from the extensions draft.
However if there is no state change, no proposal for that needs to be sent, so it also isn't included in the commit.
|
||
There are several options for how a client can send a new commit into a room. | ||
|
||
* Users on the first server listed in the "powers" list can send commit state events directly. They |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if multiple people on that first server send a commit state event at the same time? Does he server just select one to accept and return an error to the others? If so, what error code does it use?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's the regular MLS conflict case. The server selects one based on some criteria (e.g. the one that came first) and sends a rejection message to sender of the other commit.
That needs of course a message. But I don't think we defined the error messages/codes for any of the failure cases yet.
|
||
![Invite flow][invite-flow] | ||
|
||
## Potential issues |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that the possible delay in applying a commit due to "powers" servers being offline is a significant potential issue that should be listed here. For example, if the first server listed in "powers" is offline, then when you invite someone, it will take 5 minutes before they can actually be in the room. (Could they end up being halfway in the room, if the inviter closes their client after they try to invite, but before they see the commit event?)
This, combined with encrypted state, may also cause problems for MatrixRTC, which IIRC uses various state events. If it takes over 5 minutes to set up a call, or for someone to join a call, that could be problematic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This MSC expects that the servers in the powers list are online.
people invite new users at the same time or do otherwise racing modifications on different servers. | ||
For long durations no automatic recovery is possible and manual communication is necessary. We | ||
expect those cases to be rare even in larger federations. Long term server outages usually don’t | ||
include membership modifications at the same time as usually one server is completely offline. Even |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Membership modifications aren't the only reasons for sending commits, though. Group members are expected to periodically send commits (or Updates which are committed by others), which will probably be more frequent than membership changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no requirement to regularly send commits.
However, it's recommended for security. But that's a membership modification as well because updating the own leaf nodes changes the leaf node and such the membership.
But maybe it should be made clear here that any change of leaf nodes in the tree is meant.
* Servers not listed as the first entry in the "powers" list should send a "m.mls.pending_commit" | ||
as a normal message. This will then be applied according to the sorting rules of that server, or | ||
according to the rules in the following section, and a commit state event will be sent by one of the | ||
homeservers listed in the "powers" list. Only users on servers in the "can_propose" list can send |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This nearly tripped me the first time I read it, so a note for future readers: in the current existing room versions, if a server wants to send an event to a room, it needs to use a user ID of a room member that is on that server to send the event. However, in rooms covered by this MSC, there is no sender
field, so servers can send events on their own, without needing to invoke a user ID.
user to the MLS group. The key packages should be validated and should be cross-signed as configured | ||
in the additional authenticated data of the current MLS commit. | ||
After this the commit in the room is updated. Once this commit is accepted by the delivery service, | ||
the inviting user should send out the welcome message as a to_device message and send an invite |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any definition of how the Welcome message is sent as a to-device event (e.g. what event type it uses, what properties the event has). Can you add that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are just normal encrypted events:
{
"type": "m.room.encrypted",
"content": {
"algorithm": "m.mls.v1....",
"ciphertext": "<encrypted_payload_unpadded_base_64>"
}
}
user to the MLS group. The key packages should be validated and should be cross-signed as configured | ||
in the additional authenticated data of the current MLS commit. | ||
After this the commit in the room is updated. Once this commit is accepted by the delivery service, | ||
the inviting user should send out the welcome message as a to_device message and send an invite |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would be the impact if the invite was not sent, for example due to a buggy or malicious client, or due to a network error? It seems like the MLS membership list and Network membership list would get out of sync, which could be problematic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Relatedly, is there any risk of a buggy client sending an invite without having sent a welcome message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you get a welcome message but no invite, you can just ignore it. It is like getting a megolm key for a room you are not in. If a user sends out an invite without a welcome, then you need to talk to them and fix the bug or put them on your ignore list.
The invited user will then receive the invite and welcome message. If the user wants to join the | ||
room, the client can send a request to the /join endpoint. | ||
|
||
As no member events are used in this room version, the direct /state endpoint can not be used to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that member events are not used, how does network membership (invites, joins, etc) propagate between the homeservers that are participating in the room?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only the local server needs to know which of its users are in the room. Invites and joins are handled using the existing federation APIs for invites and joins, but other servers never get told about those users, they only know which servers are participating.
"powers" list. | ||
|
||
Servers have a 5 minute conflict window to apply `m.mls.proposed_commit` they receive. The | ||
highest power server can apply the commit in the first 3 minutes after the `origin_server_ts` of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this not subject to manipulation by a server wishing to put itself in charge of applying commits, by falsifying origin_server_ts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case the higher power server could still send a conflicting event, which would win during state.
Clients can know their `m.mls.proposed_commit` failed to be applied if they haven’t received a | ||
`m.mls.commit` event matching their `m.mls.proposed_commit` after "len(powers)*5" minutes. | ||
|
||
State resolution might result in some clients losing access to messages as they threw their private |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding of this section is that a netsplit of any non-trivial duration can lead to different commits being applied on each side of the split; when the split is eventually resolved, this will result in one side "losing" and any clients on the losing side will be effectively excluded from the room, since they will have discarded the necessary MLS state.
If that's correct, it feels akin to the "state reset" problems that current Matrix can suffer from, but much worse.
This seems very dangerous to me. IMHO it would be better not to allow MLS commits outside the current highest power server at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is expected. If you want to prevent such conflicts, you should only put a single server into the power list, which achieves the same result as only allowing commits from the highest power server. The option to add more servers in the powers list only exists for the cases where people expect servers to go offline eventually (like in smaller communities, where a friend decides to quit Matrix).
Rendered
Server/Synapse implementation:
Client implementation:
Related to: