Skip to content

Conversation

@roman-kulish
Copy link
Contributor

@hipertracker do you mind another PR?

Changes:

  • using xxhash for string hashing instead of Golang internal runtime.memhash (internal functions may not be safe)
  • using official collator for sorting instead of Tidwall's, this should also slightly improve sorting performance
  • removed size hints for unique words dictionary: if you don't know how much data you need to store, don't preallocate the space 🤷‍♂️
  • in the last commit I accidentally used Latin version of words extraction, in this PR I switched back to the proper Unicode one.

This version is a bit slower than the previous one (because of use of Unicode word extraction and no dictionary preallocation), but it is how I would do it, if I am not chasing extreme performance.

Could you also add RAM usage to your benchmark?

Cheers

@hipertracker
Copy link
Owner

What is the best way to check RAM usage of the running Golang script? MacOS has a tool Activity Monitor but it can't display anything running less than a second. Golang script with sorting showed up in this tool and used about 50MB.

@hipertracker hipertracker merged commit 984abbd into hipertracker:master Feb 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants