Skip to content

Conversation

@feross
Copy link
Member

@feross feross commented Feb 3, 2017

Possibly fixes: #196

Close websockets when peers are evicted from LRU cache, otherwise it's
possible for a peer object to be evicted from the LRU cache without the
socket being cleaned up. That will leak memory until the websocket is
closed by the remote client. It also messes up the stats.

if (peer.socket) {
setTimeout(function () {
try {
peer.socket.close()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we close the socket (so it sends stopped) and then create the timeout that nullifies it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once the socket is closed by the server, there is no way to send messages to the client. So I think it's fine to set peer.socket to null immediately after we close the socket.

The timeout is only needed for the case where the client sends a 'stopped' message. When the server gets the message it calls lru.remove() which emits 'evict' immediately. We need the timeout so that the server can send a response with socket.send().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LRU .remove method does not emit evict. See chriso/lru/index.js#L30

But if the peer is getting evicted because of LRU limit, it will be evicted and stopped message won't ever be received, because we are not receiving anything. We should perform the stuff done in the event of stoppped message when the eviction happens. No need for timeout. I would however close and nullify on nextTick.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LRU .remove method does not emit evict.

Good point. I needed the setTimeout to get the tests to pass, but it looks like I misunderstood what remove() was doing. I'll look again.

We should perform the stuff done in the event of stopped message when the eviction happens.

That's what happens when we call socket.close(). The onWebSocketClose() function will run, and that does a simulated 'stopped' event.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhh, right. Then try closing but nullifying on nextTick.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to nullify on nextTick. Doing it in the same tick should work fine, right?

this.peers = new LRU({
max: server.peersCacheLength || 1000,
maxAge: server.peersCacheTtl || 900000 // 900 000ms = 15 minutes
maxAge: server.peersCacheTtl || 20 * 60 * 1000 // 20 minutes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not keep it low? I mean, clients are supposedly updating every 2 minutes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This value is used for the udp and http trackers too.

I took a look at the source of opentracker (the most popular tracker software) and noticed they use 30 minutes on the server, but they tell clients to announce every 15 minutes by default.

We're currently telling clients to announce every 15 minutes and purging them at the same time. Which means if they're only a few seconds late, they'll already be removed, and they'll need to be re-added to the LRU. Better to give them some leeway.

For websocket connections, I think this value probably matters less. If the client closes the websocket, they are removed immediately (now that we added the 'evict' handling). So, the only possible downside of a large value for a websocket tracker is if someone opens a websocket but doesn't send any data on it for a long time. The server will keep the connection open for 20 minutes now. But no well-behaved clients should do that, right?

The LRU is shared for all tracker types, so we need to keep it > 15 min.

@DiegoRBaquero
Copy link
Member

What if you get evicted from one swarm but are active in others? I think we shouldn't close the socket in that case.

@feross
Copy link
Member Author

feross commented Feb 3, 2017

What if you get evicted from one swarm but are active in others? I think we shouldn't close the socket in that case.

Another good point 👍 I'll fix that up.

@feross
Copy link
Member Author

feross commented Feb 3, 2017

Fixed up.

debug('websocket close %s', socket.peerId)

if (socket.peerId) {
socket.infoHashes.slice(0).forEach(function (infoHash) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.slice(0) does nothing

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to clone the array before looping over it, otherwise it behaves incorrectly because we're mutating it inside the loop here: https://github.com/feross/bittorrent-tracker/pull/198/files/ebe3c218feab4f5a6774d5092531c6b2e55fe608#diff-831f46884145ede920506b29d5681ab2R104

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// so the stats get updated correctly.
this.peers.on('evict', function (data) {
var peer = data.value
this.announce({
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't work, peer is already evicted. _onAnnounce gets it and passes it to _onAnnounceStopped as null/undefined.

We should only perform what's inside _onAnnounceStopped:

if (peer.complete) this.complete -= 1
else this.incomplete -= 1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct. Also, I was using this instead of self here.

feross added 15 commits February 8, 2017 12:27
Possibly fixes: #196

Close websockets when peers are evicted from LRU cache, otherwise it's
possible for a peer object to be evicted from the LRU cache without the
socket being cleaned up. That will leak memory until the websocket is
closed by the remote client. It also messes up the stats.
before this change, it was getting initialized immediately, since it
was outside a tape test block
Caught this issue because of the new eviction tests. Essentially, this
change moves the socketPool into the client instance instead of a
reused variable at the module level.

When a client sends stop (or is evicted) the server will close the
websocket connection if that client is not in any other swarms (based
on peerId). However, if we are using a single socket for multiple
clients (as was the case before this commit), then other clients will
have their sockets unintentionally closed by the server.
@feross
Copy link
Member Author

feross commented Feb 8, 2017

I just added a test to ensure that eviction works correctly. And in the process of adding that test, fixed another bug related to how the socketPool works.

I think this is finally ready to merge.

@feross
Copy link
Member Author

feross commented Feb 9, 2017

You're right about nothing being done with the params, but I wanted to be consistent in case they are used in that function later. Thanks again for the thorough review!

@feross feross merged commit a469740 into master Feb 9, 2017
@feross feross deleted the fix-196 branch February 9, 2017 22:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

incomplete + complete != # of peers

3 participants