Support for creating TrackerPayload asynchronously (close #222) by paulboocock · Pull Request #226 · snowplow/snowplow-java-tracker

Paul Boocock (paulboocock) · 2020-05-05T19:12:48Z

This PR was initially based on and takes inspiration from the fork within issue #222.

I've removed the need for a new API method and I've made the async payload creation the default behaviour when using Tracker.track().

I've also gone a little beyond simply moving the payload creation: I've improved the asynchronous behaviour of the Tracker so that we can remove the synchronized keyword from the BatchEmitter.emit() method. This was tricky to solve due to the relationship between emit(), flushBuffer() and close(). In the end I settled on two LinkedBlockingQueues in a producer (emit()) and consumer (getBufferConsumerRunnable()) model. I believe this will give better throughput and, if nothing else, it allows the emit() method to return much quicker, so the host application can continue with its work and not have to wait for the java-tracker. I've done quite a bit of testing of this, but if you see (or even fear) any potential pitfalls then please let me know.
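For readers unfamiliar with the pattern, here is a minimal sketch of a producer/consumer emitter built on a LinkedBlockingQueue. It is not the code in this PR: SketchEmitter, its String events and the sentBatches list are hypothetical stand-ins, and it uses a single queue rather than the two described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for an emitter; not the tracker's real class.
class SketchEmitter {
    private final BlockingQueue<String> eventBuffer = new LinkedBlockingQueue<>();
    private final List<List<String>> sentBatches = new CopyOnWriteArrayList<>();
    private final int batchSize;
    private final Thread consumer;
    private volatile boolean closed = false;

    SketchEmitter(int batchSize) {
        this.batchSize = batchSize;
        this.consumer = new Thread(this::consume, "sketch-emitter-consumer");
        this.consumer.start();
    }

    // Producer side: enqueue and return straight away, with no lock held by the caller.
    boolean emit(String event) {
        return eventBuffer.offer(event);
    }

    // Consumer side: group queued events into batches until closed and drained.
    private void consume() {
        List<String> batch = new ArrayList<>();
        try {
            while (!closed || !eventBuffer.isEmpty()) {
                String event = eventBuffer.poll(20, TimeUnit.MILLISECONDS);
                if (event == null) {
                    continue; // nothing arrived; re-check the loop condition
                }
                batch.add(event);
                if (batch.size() >= batchSize) {
                    sentBatches.add(batch); // a real emitter would send the batch here
                    batch = new ArrayList<>();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        if (!batch.isEmpty()) {
            sentBatches.add(batch); // flush the partial batch on shutdown
        }
    }

    // Stops the consumer once everything already queued has been batched.
    void close() {
        closed = true;
        try {
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    List<List<String>> sentBatches() {
        return sentBatches;
    }
}
```

The point of the sketch is that emit() only touches the thread-safe queue, so callers never contend on a shared lock, while batching and sending happen on the consumer thread.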

There is also one breaking API change with the release. Rather than exposing the internal TrackerPayload on the API, failed sends now return the original Event. This means the standard track(Event) method can be used to retry failed sends.
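As a rough illustration of why returning the original event helps, the sketch below retries by passing failed events back through the same track() entry point. All types here (SketchEvent, SketchTracker, RetryOnFailure) and the callback shape are assumptions made for the example, not the tracker's real API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; none of these are the tracker's real types.
interface SketchEvent {
    String name();
}

class SketchTracker {
    final List<String> tracked = new ArrayList<>();

    void track(SketchEvent event) {
        tracked.add(event.name());
    }
}

class RetryOnFailure {
    // Assumed callback shape: the emitter hands back the events it could not send.
    static void onFailure(SketchTracker tracker, List<SketchEvent> failedEvents) {
        for (SketchEvent event : failedEvents) {
            tracker.track(event); // same entry point as a first-time send
        }
    }
}
```

A real retry handler would probably also want a retry limit or back-off so that a persistent failure does not loop forever.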

I've left my commit stream intact; I'll rebase this before merging into the release branch.

Paul Boocock (paulboocock) · 2020-05-10T06:46:33Z

There are two possible options for emit() in BatchEmitter. It's unlikely that either of these worst-case scenarios will ever happen: the queue is created without an explicit bound, so its capacity is Integer.MAX_VALUE. However, if events are added faster than they can be consumed, the buffer may eventually become full.
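The capacity claim can be checked directly against java.util.concurrent. The small sketch below (QueueCapacityDemo is a hypothetical name) compares a queue created with the no-argument constructor to one with an explicit bound.

```java
import java.util.concurrent.LinkedBlockingQueue;

class QueueCapacityDemo {
    // Remaining capacity of a queue created with the no-argument constructor.
    static int defaultRemainingCapacity() {
        return new LinkedBlockingQueue<String>().remainingCapacity();
    }

    // Remaining capacity of an explicitly bounded queue after one insert.
    static int boundedRemainingCapacity(int capacity) {
        LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>(capacity);
        queue.offer("event");
        return queue.remainingCapacity();
    }
}
```

An empty default queue reports Integer.MAX_VALUE remaining slots, so in practice memory is likely to run out long before the queue itself rejects an insert.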

Option 1: Will block on put() if eventBuffer becomes full. This means we won't lose events but it would block the thread that is adding the event, potentially the main thread of the hosting application. This could lead to a negative impact on the hosting application.

    public void emit(final TrackerEvent event) {
        try {
            eventBuffer.put(event); // Add to buffer; waits for space if the buffer is full
        } catch (Exception e) {
            LOGGER.error("Unable to add event to emitter", e);
        }
    }

Option 2: Will not block on offer() if eventBuffer becomes full but will throw away the event. This means we will lose events but it would never block the thread that is adding the event, removing the risk of blocking the main thread of the hosting application.

    public void emit(final TrackerEvent event) {
        boolean result = eventBuffer.offer(event); // Add to buffer and quickly return back to application
        
        if (!result) {
            LOGGER.error("Unable to add event to emitter, emitter buffer is full");
        }
    }

I've switched to Option 2. Whilst either scenario is unlikely, I believe this is most similar to the previous behaviour, and letting tracking block the main work of the application doesn't feel like the right choice.
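To make the trade-off concrete, this sketch fills a deliberately tiny bounded queue and then tries to add one more element. FullBufferDemo and the one-slot capacity are assumptions chosen so the full-buffer case is easy to reach; the timed offer() stands in for put(), which would otherwise wait indefinitely.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class FullBufferDemo {
    // Option 2 behaviour: a plain offer() on a full queue fails at once.
    static boolean offerWhenFull() {
        BlockingQueue<String> buffer = new LinkedBlockingQueue<>(1);
        buffer.offer("first");
        return buffer.offer("second"); // false: "second" is dropped, caller is not blocked
    }

    // Option 1 behaviour, approximated: the caller waits for space that never appears.
    static boolean timedOfferWhenFull(long timeoutMillis) {
        BlockingQueue<String> buffer = new LinkedBlockingQueue<>(1);
        buffer.offer("first");
        try {
            return buffer.offer("second", timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Both calls return false here, but the timed variant holds the calling thread for the whole timeout first, which is the cost Option 2 avoids.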

Paul Boocock (paulboocock) · 2020-06-06T12:40:43Z

Made the changes based on your suggestions. Nice ideas!

Removing the payload cache also had an unintended consequence: the cache had accidentally been making the BatchEmitterTests pass, because the STM parameter was added to the cached payload reference when in reality it shouldn't have been added to those cached payloads.
With the cache gone the tests failed as the Maps didn't equal each other, so I've brought in Hamcrest to do some better assertions on the Map entries, ignoring extra entries that might be present in the captured payload and checking that all the values sent from the original event are present. This leaves other tests to test additional params, like STM.
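The kind of assertion described here, "every expected entry is present, extras are ignored", can be sketched in plain Java as follows. This is not the test code from the PR: PayloadAssertions is a hypothetical helper, and the keys and values in the usage note are illustrative only.

```java
import java.util.Map;

class PayloadAssertions {
    // True when every entry of expected appears in actual with an equal value.
    // Extra entries in actual (for example a sent timestamp) are ignored.
    static boolean containsAllEntries(Map<String, String> actual, Map<String, String> expected) {
        return actual.entrySet().containsAll(expected.entrySet());
    }
}
```

With an expected map of {e=pv, p=srv} and a captured map that also carries an stm entry, Map.equals() fails but containsAllEntries() passes. Hamcrest's hasEntry matcher expresses a similar per-entry check with more readable failure messages.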

I've also removed the ability to mutate the tracker properties once the tracker has been constructed (a small breaking API change, but this is 0.x). This seems particularly important now that the payloads are created asynchronously: changing tracker parameters could lead to events that have already been passed to the tracker being built from parameters that were adjusted after they were passed in. If users want to change tracker parameters, we will suggest constructing a new instance of Tracker using the TrackerBuilder.
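A minimal sketch of the "immutable once built" idea follows. SketchImmutableTracker, its nested Builder and the two fields are hypothetical and do not mirror the real Tracker or TrackerBuilder signatures.

```java
// Hypothetical stand-in; field names and builder methods are assumptions.
final class SketchImmutableTracker {
    private final String namespace;
    private final String appId;

    private SketchImmutableTracker(Builder builder) {
        this.namespace = builder.namespace;
        this.appId = builder.appId;
    }

    String namespace() {
        return namespace;
    }

    String appId() {
        return appId;
    }

    static final class Builder {
        private String namespace = "";
        private String appId = "";

        Builder namespace(String value) {
            this.namespace = value;
            return this;
        }

        Builder appId(String value) {
            this.appId = value;
            return this;
        }

        // Copies the current settings into a new tracker with final fields.
        SketchImmutableTracker build() {
            return new SketchImmutableTracker(this);
        }
    }
}
```

Because the fields are final and there are no setters, an event queued for asynchronous payload creation always sees the parameters its tracker was built with; changing a parameter means building a second instance.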

Ian Streeter (istreeter)

Looks very neat now. No more comments to add 👍

bbplanon and others added 4 commits March 27, 2020 16:18

Added support for asynchronous conversion of an Event to a TrackerPayload.

12c5318

Revert emit to synchronized method

2f13025

Support for creating TrackerPayload asynchronously (close #222)

ce8a554

Update failed events to emit original Event

80051bc

Paul Boocock (paulboocock) requested a review from Ian Streeter (istreeter) May 5, 2020 19:12

Paul Boocock added 5 commits May 6, 2020 17:04

Moved unnesting of EcommerceTransactionItems into TrackerPayload generation

aef827b

Switch BatchEmitter to use a concurrent Producer/Consumer model

b7edbdd

Fix flushBuffer of BatchEmitter with new Producer/Consumer model

feb4010

Fix test timing

f3db245

Fix order inconsistency in test with flushBuffer

06abc89

Paul Boocock (paulboocock) marked this pull request as ready for review May 9, 2020 21:48

Paul Boocock added 2 commits May 10, 2020 07:24

Maintain order of events in flushBuffer

cc8002b

Switch emit to be non-blocking

365ad66

Paul Boocock added 2 commits May 10, 2020 08:09

Update comments

167c4a0

Remove unused import

ba999cb

Ian Streeter (istreeter) reviewed May 10, 2020

Paul Boocock added 5 commits June 5, 2020 10:50

Extract event sending function

14d775e

Move bufferSize functions into BatchEmitter

a49bfaa

Remove payload cache

39cd4c8

Introduce Tracker Parameters to Tracker Events

67ca8df

Add Hamcrest

efc9721

Paul Boocock (paulboocock) requested a review from Ian Streeter (istreeter) June 6, 2020 12:36

Ian Streeter (istreeter) approved these changes Jun 8, 2020

Paul Boocock (paulboocock) merged commit 4d81f0e into release/0.10.0 Jun 8, 2020

Paul Boocock (paulboocock) deleted the issue/222-async_payload branch June 8, 2020 10:57

Paul Boocock (paulboocock) linked an issue Jun 8, 2020 that may be closed by this pull request

Switch Emitter to use threadsafe collection for buffer #225

Closed

Miranda Wilson (mscwilson) mentioned this pull request Jan 5, 2022

Refactor TrackerEvents for event payload creation #291

Closed

Paul Boocock (paulboocock) merged 18 commits into release/0.10.0 from issue/222-async_payload
