
Huge Update #2 #9

Open · wants to merge 38 commits into master
Conversation

Contributor

@LeonMatthes LeonMatthes commented Jul 26, 2020

Implement the changes requested by @futile on #6.

Additionally, the PeerID is now a separate type, so people can't just pass arbitrary numbers into the peer indexing.

For now I chose to keep Index and IndexMut on Host. Since the primary way to access a peer is now through the host via its PeerID, indexing is just a lot more convenient (see examples/client.rs).
If you don't like this, I can of course remove the impls again.

I decided not to touch the constructors of Peer; it does feel quite hacky, but it also cleans up the code a fair bit...
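
For illustration, a minimal sketch of the newtype-plus-indexing idea (all names and layout here are placeholders, not this PR's exact code):

```rust
use std::ops::{Index, IndexMut};

// The private field prevents constructing a PeerID from an arbitrary number.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct PeerID(usize);

pub struct Peer {
    // ENet peer state elided in this sketch.
}

pub struct Host {
    peers: Vec<Peer>,
}

impl Index<PeerID> for Host {
    type Output = Peer;

    fn index(&self, id: PeerID) -> &Peer {
        &self.peers[id.0]
    }
}

impl IndexMut<PeerID> for Host {
    fn index_mut(&mut self, id: PeerID) -> &mut Peer {
        &mut self.peers[id.0]
    }
}
```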

htrefil and others added 30 commits January 8, 2020 12:36
Note: The old implementation was actually faulty and caused a segfault,
as the Vec was dropped instead of its memory being leaked to ENet; this
is now fixed using Box::leak().
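
A simplified, hypothetical illustration of the Box::leak() pattern behind that fix (not the crate's actual code):

```rust
// The buffer handed to ENet must outlive the Vec's scope, so ownership is
// deliberately given up via Box::leak instead of letting Drop free it.
fn leak_for_enet(bytes: Vec<u8>) -> *mut u8 {
    let leaked: &'static mut [u8] = Box::leak(bytes.into_boxed_slice());
    leaked.as_mut_ptr()
}
```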
Event now no longer requires a lifetime, however replying directly to
the message of a peer is a bit more cumbersome.
@SeanTolstoyevski

What is the latest status?

@949f45ac

949f45ac commented Mar 2, 2022

You might want to take a look at the moduleverse branch of my fork.

Hey @LeonMatthes, thanks for your work on this!
While the repo maintainer is unresponsive, would you consider publishing your fork on crates.io under a different name, so that it can more conveniently be included in projects?
Kind regards

@LeonMatthes
Contributor Author

LeonMatthes commented Mar 15, 2022

Hey @LeonMatthes, thanks for your work on this! While the repo maintainer is unresponsive, would you consider publishing your fork on crates.io under a different name, so that it can more conveniently be included in projects? Kind regards

I'd prefer to not pollute the crates.io namespace if possible.
For now, you can just add this to your Cargo.toml:

enet = { git = "https://github.com/LeonMatthes/enet-rs.git", branch = "moduleverse" }

Which should be easy enough 😊

@futile
If you don't want to continue supporting this repository, you could give me maintainer access to this crate on crates.io and I could continue work on my fork.

@JohnDowson

While we're at it, Connect is supposed to have data just like Disconnect does.

@LeonMatthes
Contributor Author

@futile Great to see you're back :)

Unfortunately this PR has now deviated too far from your work for a rebase to be feasible. (I'm trying right now, but it's just way too cumbersome).
I can merge your master branch into this PR though if you're still interested in these changes.

I know it's a large PR, but I think this really "rounds things off" nicely.
If you want we could also do a zoom call or similar, so we could go through the changes quickly.

This fork is still working great in my own project, over 1 year of use now.
The best thing I have to say about it is that I haven't had to touch the code again since creating this PR, which probably also means the code is likely easy to maintain.

@LeonMatthes
Contributor Author

While we're at it, Connect is supposed to have data just like Disconnect does.

Probably best to keep stuff like that in a separate PR, so this PR doesn't balloon any further.

@JohnDowson

Probably. I guess I'll wait until this gets merged and then push my patches.

@futile
Owner

futile commented Mar 31, 2022

Hey everyone, and sorry for the long silence! :) And thanks a lot for your contributions and discussions! I've had time to come back to this over the last few weeks, and plan to see this PR through as well; I haven't looked too closely at the changes again yet, but I'll have time to do so over the next few days. @LeonMatthes thanks for the offer! I'll get back to you once I've taken a look at all the changes. The fact that you've been using the library with your changes for quite some time without needing much else is really useful feedback though! :)

@futile
Owner

futile commented Apr 6, 2022

Okay, just took a look at the changes again, and while there are still some things to discuss/change, I think the changes are useful in general. @LeonMatthes can you merge master into this PR/your moduleverse branch and either push to this PR or open up a new one? I think that would be a good basis for discussion. Thanks!

Some stuff I think still needs some adjustments:

  • PeerIDs are useful, but I'm not sure if ENet reuses IDs/slots in the peers-array. If it does, I would like some kind of generation to be added to PeerIDs, so old/stale PeerIDs don't accidentally still "work" but refer to other Peers. If ENet doesn't reuse IDs/slots then this won't be needed, but I think it's sensible to check, because otherwise we basically end up with dangling pointers again (just in the form of IDs).
  • I think the Index and IndexMut impls for Host should go, as a Host isn't really a "list"/container of peers. If we want something like this, we could make a type Peers that can be retrieved from a Host, which can be indexed, iterated over, etc. But I don't think this is necessary for now; Host::peer() and Host::peer_mut() seem ergonomic enough to me. Open for discussion here, though :)
  • Probably some cleanup of the history will also be necessary; the PR is already at 34 commits for ~ +350/-150 changes, which is too much imo. BUT since the branches have already diverged too far for a rebase anyway, we can worry about this after everything else has been discussed. One solution would be to simply take the whole diff once everything else has been resolved, and turn it into individual (new) commits, which then have clear scopes, commit messages, etc.

Once again, thanks for the PRs/discussion, much appreciated!

@LeonMatthes
Contributor Author

Having generations for peers is definitely useful; I was concerned about "dangling" PeerIDs as well. AFAIK enet reuses them. I'll have to figure out a way to count the generations though.
I'll see when I have time to address your change requests. I'll also remove the Index/IndexMut.

@LeonMatthes
Contributor Author

And thank you for reviewing this, it's really gotten too big over time.
I completely understand that it's a big ask of any maintainer to work with so much code that's just dumped on your head.

Thank you!

@JohnDowson

IMHO, worrying about generational indexing could be postponed until another PR.

@futile
Owner

futile commented Apr 7, 2022

@LeonMatthes thanks a lot! Feel free to just merge master into your branch and push that to this/a new PR to get started! :)

IMHO, worrying about generational indexing could be postponed until another PR.

The thing is, switching to non-generational indices would make the library interface less correct than it is now, which is something I generally avoid. I also don't think generational indices are that much of an (implementational) overhead; a simple version would just keep a Vec or a HashMap (or sth. similar) around to track the current generation of each slot, adjust it on connects/disconnects, and check it when indexing. If there are issues I'm not seeing here, I would rather move PeerIDs to an individual PR, so the rest of this can be merged independently. However, I don't think that's necessary, and it should be fine to do this in this PR, since it already includes most of the index infrastructure.
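
For illustration, a minimal sketch of that bookkeeping in the HashMap variant (all names are placeholders, not actual library code):

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct PeerID {
    index: usize,
    generation: u64,
}

pub struct Peer;

pub struct Host {
    peers: Vec<Peer>,
    // Current generation of each peer slot, bumped whenever a slot is reused.
    generations: HashMap<usize, u64>,
}

impl Host {
    pub fn peer(&self, id: PeerID) -> Option<&Peer> {
        // A stale PeerID (one from an older generation) no longer resolves.
        match self.generations.get(&id.index) {
            Some(&current) if current == id.generation => self.peers.get(id.index),
            _ => None,
        }
    }

    fn bump_generation(&mut self, index: usize) {
        // Called when a slot's connection ends, invalidating old PeerIDs.
        *self.generations.entry(index).or_insert(0) += 1;
    }
}
```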

But I agree that other new things should probably not be added to this PR anymore, so the scope stays where it is now.

Also, if you have some other changes you would like to get merged, feel free to put up a PR, and we can see whether the PRs overlap/in which order we should merge. If your changes are already based on @LeonMatthes's code, then it's probably best to wait for this PR I guess 😅 :)

@LeonMatthes
Contributor Author

Hm, I've made the PeerID generational.
I've moved the drop_disconnected call from process_event to service, as otherwise the stale Peer would not be dropped until a new Event is received. Having the Peer be invalidated at the next service is much easier to understand in my opinion.

There is, however, a problem: disconnect_now currently doesn't increment the Peer generation.
The peer doesn't have any way to access the Host, so it can't do the increment itself.

I'm considering storing the generation in the Peer data itself, that probably has its own problems...
Will have to give this some more thought. Any suggestions appreciated as well.

src/peer.rs Outdated
@@ -174,6 +176,7 @@ where
///
/// No `Disconnect` event will be created. No disconnect notification for the foreign peer is guaranteed, and this `Peer` is immediately reset on return from this method.
pub fn disconnect_now(&mut self, user_data: u32) {
// TODO: Increment the peer_generation for this peer.
Owner

@futile futile Apr 7, 2022

I see the problem. Would it be possible to borrow the peer_generations HashMap from Host when we create the Peer? I'd think this might work, since Peer already borrows Host, so lifetime- and borrowing-wise this seems like it could work out. But there could just as well be lifetime or borrowing issues, so maybe another solution is needed.

Edit: Instead of borrowing the whole HashMap it might also be possible to just borrow a single entry/ref to a single entry, so we don't need to store the index in the Peer as well.

Contributor Author

@LeonMatthes LeonMatthes Apr 7, 2022

That sounds possible; the representation of the Peer will have to change again, but I think it's doable.
It's not the cleanest solution, but better than having incoherent lifetimes.

I think I might even take a look at the original enet code to get a better understanding of the peer lifetimes.
I'm guessing that a Peer will always continue to live, even after disconnect, until a new peer connects that takes its place in the host.

In that case, this wouldn't actually be an issue, and we could only increment the peer generation when a new Peer connects into an existing peer slot.
The PeerID would remain valid until that point, which accurately represents the Peer lifetime. If my assumptions are correct.
But there might also be enet-internal reasons a Peer is destructed after disconnect, possibly due to enet resizing the peers array to save storage.

Owner

I think I might even take a look at the original enet code to get a better understanding of the peer lifetimes.
I'm guessing that a Peer will always continue to live, even after disconnect, until a new peer connects that takes its place in the host.

If you want to take a look, that would be great! But I think going with the "safe" version of regarding a peer's lifetime as over once the peer disconnects makes sense, because a new peer could (theoretically) be connected very quickly, so there is nothing to rely on.

In that case, this wouldn't actually be an issue, and we could only increment the peer generation when a new Peer connects into an existing peer slot.
The PeerID would remain valid until that point, which accurately represents the Peer lifetime. If my assumptions are correct.
But there might also be enet-internal reasons a Peer is destructed after disconnect, possibly due to enet resizing the peers array to save storage.

This would be another possible solution if ENet guarantees this behavior. I'm not sure if ENet actually (ever) resizes the underlying array, it might just have a fixed (maybe user-chosen?) size.

Contributor Author

I've taken a look at the Enet implementation.
Peers are actually never re-allocated, as the user has to choose the exact number of peers that will be allocated when creating the enet host.
There is no way to change the number of peers afterwards.

Therefore, if we want to talk about "safe" lifetimes in Rust, technically, a reference to a peer can be valid the entire time the host exists.
So we technically don't even need the generational indexing at all, as the "peer" kind of always stays the same, no matter who is connected with it.

But if we want to define the peer's lifetime as being bounded by a single connection, things are more complicated.

As soon as the peer state changes to ENET_PEER_STATE_DISCONNECTED, it is "free for reuse" if you will.
With disconnect_now, this happens immediately. With the other disconnect cases, I'm pretty sure the state changes right before you get a "disconnect" event, so that is already handled by the current logic.
The peer being "free for reuse" means that when a new peer connects, the first peer that is marked as disconnected will be used for the new connection.
So in a way, my previous suggestion was right, that we could keep the previous peer alive until the next connection happens in that slot.

The problem with this approach is that a peer connection doesn't just happen immediately.
The peer can go through multiple stages of "this connection has started, but isn't yet complete" before you get the "connect" event.
Whilst this isn't that big of a deal, as you'll never get any events referencing that specific peer, you might notice that Enet is changing parts of the existing peer, like the state, before the previous PeerIndex loses its validity.

So we have 3 options here:

  1. peer lifetime == host lifetime
    • Closest to the Enet definition
    • Easiest to implement - No generational Indexing
    • Unexpected from a user perspective, as peer != connection
    • Requires keeping manual track of when a PeerIndex loses its association to a specific connection -> less rusty - sort of manual garbage collection
  2. Peer lifetime == connect -> disconnect
    • Easiest to understand - peer == connection
    • Most "correct" lifetime - no side effects from new connection in same peer
    • Hardest to implement - need to track every kind of disconnect (currently only disconnect_now & disconnect event)
  3. Peer lifetime == connect -> disconnect -> connect
    • Still relatively easy to understand - peer is alive until another connection takes its slot.
    • Easier to implement, only need to process connect event
    • Lifetime might end sporadically without notice (from user perspective)

In my opinion we should stick with Option 2, as it's the most "correct" as long as we assume peer lifetime == connection. If you agree, I can get started on implementing this like you suggested earlier.

Also, I've got another idea for what to do with the data associated with a peer.
Currently it is dropped in the drop_disconnected call, which is called by the service function.
So the lifetime of a peer doesn't really end when the Disconnect event is received, but rather the next time you call the service function.
I guess @htrefil has implemented it this way so the user of the Host has the ability to re-claim or otherwise use the data before it is finally inaccessible.
It might therefore be valuable to, instead of simply "destroying" the data, give ownership of it back to the user.
I.e. the Disconnect EventKind could be expanded to also contain the data by-value.
That way it would be clear that the lifetime of the peer, as well as its data, is strictly bound by the Disconnect event.
disconnect_now could of course also return the data associated with the peer.
Host could also be simplified and wouldn't need to keep track of the last disconnected peer.
It would just invalidate the peer index as soon as the disconnect is received and be done with it.
Enet resets all peer data (apart from the user data) at that point anyway.
So there is no reason to access a peer after this.
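
For illustration, a rough sketch of that shape (placeholder types; T stands for the user data type):

```rust
pub enum EventKind<T> {
    Connect,
    Receive { /* packet fields elided */ },
    // Hands the user data back by value once the connection is over.
    Disconnect { data: Option<T> },
}

pub struct Peer<T> {
    data: Option<T>,
}

impl<T> Peer<T> {
    // disconnect_now consuming the peer and returning its data, as suggested.
    pub fn disconnect_now(mut self, _user_data: u32) -> Option<T> {
        // ...tell ENet to reset the connection immediately...
        self.data.take()
    }
}
```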

I hope this writeup is useful and can give you a better idea of peer lifetime :)

Let me know which of the 3 options you want me to implement (I think No. 2 is best) and whether you want me to rework the data dropping.
It might also make sense to do the data dropping in a separate PR, which I'd be happy to do as well.

Owner

@futile futile Apr 26, 2022

Thanks for taking such an in-depth look and for the clear summary, super helpful!
I agree that option 2 sounds best. Option 1 does not feel intuitive at all (esp. in Rust), and option 3 feels spurious, as you said.
Option 2 seems like it would provide the nicest "rusty" usage of the library, so let's go with that!

On the point of what to do with data: I think your idea is good, returning the associated data in the Disconnect event/as the return value of disconnect_now.
What's the difference to how it's currently done? The Disconnect event currently contains the Peer that disconnected, which allows retrieving the associated user data (and the data is dropped when the event is dropped). Is it theoretically possible that a new connection has already been established for the same peer at this point?

Whatever we go with, it should also still be part of this PR imo, as otherwise we merge code that we don't want to keep around anyway/a feature we don't like, which doesn't sound good to me.

Thanks again for taking such an in-depth look, and for your work on this PR!

Owner

@futile futile Apr 28, 2022

Heads-up: My thoughts go back and forth here a bit, (in)conclusion at end

Okay, I'll go with implementing No. 2 then once I find time for it.

Awesome, thanks a lot! :)

Regarding data: with this PR the Disconnect event only returns the PeerID instead of the Peer, which also means the Event no longer needs a lifetime - which is good. But if we define the new Peer lifetime as "the peer dies as soon as Disconnect is received", the PeerID will no longer refer to any Peer. That by itself is fine, as you can still use the PeerID to map back to your own data associated with the connection. It does, however, make it impossible to get the data out of the peer.

Yeah, this was my understanding as well. What I meant is that currently, on master/the crates.io version, the Disconnect event contains a Peer - which allows the user to access/take the data (if they want), but also allows them to, e.g., call Peer::address() or Peer::mean_rtt(). This also feels a bit more future-proof to me. That is, if ENet ever adds another field like data to their peers, giving the user control of Peer won't require any more changes (to the Disconnect event at least). But if we directly store data in the Disconnect event, we will now also need to store the new field, and so on for future changes.

The downside of this variant is that giving the user the Peer during a Disconnect event also means the user can call, e.g., Peer::send_packet() or other functions that don't really make sense/are unclear.

I wonder whether we can't just keep it like this, and return the Peer in the Disconnect event. This also seems to be close to what ENet does, which also contains the peer (well, a pointer to it, not an index) in its disconnect event. This would not get rid of the lifetime on Event, but I don't really mind that, or has this lifetime been a hassle in practice?

I guess even though we have peer ids now, it might sometimes be more ergonomic to directly use Peers in parts of the API as well.

But, if we define the new Peer lifetime as "the peer dies as soon as Disconnect is received", the PeerID will no longer refer to any Peer.

Since ENet contains the Peer as part of its disconnect event, I think we can safely extend the lifetime of our Peer until the Disconnect event is handled/dropped, without running into lifetime issues. However, that would mean keeping drop_disconnected() around, effectively extending the lifetime of a Peer/PeerId until service() is called again after handling the Disconnect event. This isn't super nice, but could be ok imo, as the benefit is that the user can still access the Peer while handling the Disconnect event.

But oof, this is a bit more annoying than I thought 🙈 What do you think? I would really like to keep the Peer around until Disconnect is handled/dropped, so that, e.g., Peer::address() can still be called (which I think is useful in practice); on the other hand, it requires some extra bookkeeping.

Hope that explanation makes it a bit clearer :)

Hope I was also able to formulate what I meant this time :) Thanks for taking the time to work/communicate on this! :)

Contributor Author

Hm, then maybe extending the lifetime of a peer until the Disconnect Event is dropped might also make sense 🤔
As we'll have to have a way to end the Peer's lifetime in disconnect_now anyway, we can just call that in the Drop implementation of the Disconnect Event 🤔

Then we wouldn't need drop_disconnected, as that would be handled by the Disconnect event.
And the user can then also control the lifetime of the Peer by keeping the Disconnect event around as long as needed.

That does still leave the user with the ability to call methods like send_packet, but these would then probably just fail by returning an error, which is fine.

Owner

Ahh yeah, that sounds good: if we store a Peer in the Disconnect event, then we can use that in the Drop impl. Nice thinking! :) I would, however, not call disconnect_now() there, because the peer is already disconnected at that point, and disconnect_now() should correspond as directly as possible to ENet's disconnect_now() function. I would just make a new (private) fn called cleanup_after_disconnect() or similar, which is not user-facing and performs the required cleanup (i.e., dropping the user data).

Nice, good idea! :)
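
For illustration, a tiny sketch of that Drop hook (placeholder types; cleanup_after_disconnect is the private helper suggested above):

```rust
pub struct Host;

#[derive(Clone, Copy)]
pub struct PeerID(usize);

impl Host {
    fn cleanup_after_disconnect(&mut self, _id: PeerID) {
        // Drop the user data, invalidate the PeerID, etc.
    }
}

pub struct DisconnectEvent<'a> {
    host: &'a mut Host,
    peer_id: PeerID,
}

impl Drop for DisconnectEvent<'_> {
    fn drop(&mut self) {
        // Runs once the user is done with the event, so the Peer stays
        // accessible for exactly as long as the event is held.
        self.host.cleanup_after_disconnect(self.peer_id);
    }
}
```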

Contributor Author

Yup, that was what I was thinking as well. I'll hopefully get around to implementing that this week 🤔

Contributor Author

Alright, I think I've got a working implementation now.

Instead of the Peer containing a reference to the HashMap, I opted to simply store the Generation inside the Peer.data itself.
That worked out quite well.
While I was at it, I also expanded the data API a bit.
It's now split into 4 methods (a rough sketch follows the list):

  • data - immutable access
  • data_mut - mutable access
  • set_data - sets the data to a new value; None can no longer be inserted here
  • take_data - takes ownership of the data out of the Peer again
    I think a way to take ownership back was really missing here.
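
A rough sketch of that API shape (simplified placeholder signatures; T is the user data type):

```rust
pub struct Peer<T> {
    data: Option<T>,
}

impl<T> Peer<T> {
    pub fn data(&self) -> Option<&T> {
        self.data.as_ref()
    }

    pub fn data_mut(&mut self) -> Option<&mut T> {
        self.data.as_mut()
    }

    pub fn set_data(&mut self, data: T) {
        // Takes a plain value, so `None` can no longer be inserted here.
        self.data = Some(data);
    }

    pub fn take_data(&mut self) -> Option<T> {
        // Returns ownership of the data to the caller.
        self.data.take()
    }
}
```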

Furthermore, the Event had to change to no longer grant public access to all its members.
Otherwise, one might simply mutate the event's kind to prevent the dropping of the Peer data and ID invalidation.
The Drop implementation does make the Event a little bit more unwieldy, as the Event can no longer be taken apart easily.
I've added a take_kind method that can at least take ownership of the EventKind, which was actually necessary for my own project to work with this change.
However, having direct access to the Peer instead of just the PeerID in the Event did prove useful.
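
For illustration, a sketch of how take_kind can move the kind out of a type that has a Drop impl (placeholder types; the real Event carries more state):

```rust
pub enum EventKind {
    Connect,
    Disconnect,
    Receive { data: Vec<u8> },
}

pub struct Event {
    // Option lets take_kind move the kind out even though Event implements Drop.
    kind: Option<EventKind>,
}

impl Event {
    pub fn take_kind(mut self) -> EventKind {
        let kind = self.kind.take().expect("kind is present until taken");
        if matches!(kind, EventKind::Disconnect) {
            // Drop can no longer see the kind, so the Disconnect cleanup
            // (dropping user data, invalidating the PeerID) must run here.
        }
        kind
    }
}

impl Drop for Event {
    fn drop(&mut self) {
        if matches!(self.kind, Some(EventKind::Disconnect)) {
            // Normal path: the event is dropped without take_kind, so the
            // Disconnect cleanup runs from here.
        }
    }
}
```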

Overall, adding the generational indexing does make the code quite a bit more complex, but it does make a lot of sense to have the PeerID become invalid once the connection terminates.
That also stood out positively when I adapted my project to the new enet-rs implementation.

@LeonMatthes LeonMatthes force-pushed the master branch 2 times, most recently from f7dbc8b to b448b18 on May 5, 2022
Side effects: The members of the PeerEvent now need to be private to
make sure the Peer is correctly cleaned up.
Owner

@futile futile left a comment

Alright, went through the changes, thanks a lot for putting this together! There are still some small-ish points, but most things look good.

There is one bigger point that I just realized while going through the changes: Performing logic in the destructor of Event is somewhat problematic, as Rust generally considers mem::forget() safe. While (I don't think) we create unsafety when an Event is forgotten (without running its Drop impl), logic errors/dangling pointers become possible; I think it's still fair to go with the current design, but we should put a notice in the docs to make this clear.

/// Take the EventKind out of this event.
/// If this is a Disconnect event, it will clean up the Peer.
/// See the `Drop` implementation.
pub fn take_kind(mut self) -> EventKind {
Owner

What is your use case for requiring this method? I would guess you want to take out the packet from a Receive-event?

Contributor Author

Yes, that was my use case.
I have one more layer of abstraction on top of enet-rs in my current application.

I'm sending all received event kinds for each peer through crossbeam channels, such that my Peers just boil down to a single crossbeam receiver, which is very multi-threading friendly.

That's why I need to be able to take ownership of the EventKind.
This entire change did make my codebase better though, as I no longer had to check whether the peer's state was Disconnected, Zombie, or any of the other intermediate states a Peer might be in.
I could simply check if its PeerID was still valid :)
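
For illustration, the rough shape of that layering, assuming the crossbeam-channel crate (all names here are placeholders):

```rust
use crossbeam_channel::{unbounded, Receiver, Sender};

enum EventKind {
    Connect,
    Disconnect,
    Receive(Vec<u8>),
}

// Each peer boils down to a receiver of its own event kinds.
struct PeerHandle {
    events: Receiver<EventKind>,
}

fn main() {
    let (tx, rx): (Sender<EventKind>, Receiver<EventKind>) = unbounded();
    let peer = PeerHandle { events: rx };

    // The network thread would take ownership of each kind (via take_kind)
    // and send it; channel messages must be owned values, hence take_kind.
    tx.send(EventKind::Receive(b"hello".to_vec())).unwrap();

    if let Ok(EventKind::Receive(bytes)) = peer.events.try_recv() {
        assert_eq!(bytes, b"hello");
    }
}
```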

@LeonMatthes
Contributor Author

@futile Sorry that it has taken me so long to get back to this.
I've incorporated your feedback wherever I didn't have questions and marked the conversations as resolved.
There's basically just a small question remaining about whether to continue using Duration in the API.

Given that forgetting e.g. a std::Vec or other data structures in Rust should always be considered carefully, I think using a Drop impl for cleanup here is fine. I've added a comment nonetheless, like you suggested.

@Jachdich

Is there any update on this? Thinking of using this library in my project and I'm just curious as to the state of development.

@ryanmcgrath

Seconding @Jachdich - curious if this is close to being merged?

@michidk

michidk commented Apr 9, 2023

Wow, this PR is great - exactly what I was looking for. Looking at this PR, I would rather use this instead of the current master. Sometimes it's unavoidable to publish a forked crate. I would rather have multiple forks on crates.io than one outdated one.
@LeonMatthes what exactly is the difference between your moduleverse branch and this PR (your master branch)? moduleverse contains some older changes, right?
@futile is there anything preventing this from being merged and uploaded to crates.io?

@LeonMatthes
Contributor Author

@LeonMatthes what exactly is the difference between your moduleverse branch and this PR (your master branch)? moduleverse contains some older changes, right? @futile is there anything preventing this from being merged and uploaded to crates.io?

Yeah, the moduleverse branch is a bit outdated now, it doesn't contain the generational PeerID yet.
It does however contain some Send, Sync and Hash implementations that aren't in this master branch yet.
Should probably open a separate PR for them...
If you don't need the Send, Sync, or Hash impls that I added to the moduleverse branch, working with the branch of this PR should be fine.
Can't guarantee any further breaking changes though.

I've been battling a bindgen-issue with enet-sys after updating my Fedora yesterday.
see: ruabmbua/enet-sys#19

Once the enet-sys stuff is sorted out, I'll consider uploading my own version to crates.io (maybe enet-rs-next?).
Want to avoid that if at all possible though, as it's much better for the community to not have to deal with multiple forks for the same thing.
For now, feel free to just use:

enet-rs = { git = "https://github.com/LeonMatthes/enet-rs.git" }
