word2vec.c: Minor tweak to reduce CPU pipeline stalls (3% gain) #36

Open
@GoogleCodeExporter

Description

The perf utility reported a bottleneck in the negative sampling code, where the 
CPU stalls waiting for the 'target' variable to be finalized. Computing the next 
value of target one iteration early increases overall performance by about 3% on 
a small (128MB) training file.

Add next_target next to target in the variable list.

Then, in the two negative sampling blocks, change the target selection to:
...
          if (d == 0) {
            target = word;
            label = 1;
            next_random = next_random * (unsigned long long)25214903917 + 11;
            next_target = table[(next_random >> 16) % table_size];
          } else {
            target = next_target;
            if (target == 0) target = next_random % (vocab_size - 1) + 1;
            next_random = next_random * (unsigned long long)25214903917 + 11;
            next_target = table[(next_random >> 16) % table_size];
            if (target == word) continue;
            label = 0;
          }
...
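
For anyone who wants to check the reordering in isolation, here is a minimal sketch comparing the two variants on a toy table. The table contents, TABLE_SIZE, VOCAB_SIZE, and the helper names (lcg, sample_baseline, sample_prefetched, same_draws) are all hypothetical and not part of word2vec.c; the loop bound of negative + 1 is also an assumption, since the snippet above elides the surrounding loop. It only illustrates that, within one pass over d, the early computation of next_target should yield the same targets as drawing them on demand.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the word2vec.c globals; sizes are illustrative. */
enum { TABLE_SIZE = 8, VOCAB_SIZE = 5, MAX_DRAWS = 64 };
static const int table[TABLE_SIZE] = {0, 1, 1, 2, 3, 3, 4, 0};

/* Same linear congruential step as in the snippet above. */
static unsigned long long lcg(unsigned long long r) {
  return r * 25214903917ULL + 11;
}

/* Baseline: the random target is drawn in the iteration that consumes it. */
static int sample_baseline(long long word, int negative,
                           unsigned long long seed, long long *out) {
  unsigned long long next_random = seed;
  long long target;
  int n = 0;
  for (int d = 0; d < negative + 1; d++) {
    if (d == 0) {
      target = word;
    } else {
      next_random = lcg(next_random);
      target = table[(next_random >> 16) % TABLE_SIZE];
      if (target == 0) target = next_random % (VOCAB_SIZE - 1) + 1;
      if (target == word) continue;
    }
    out[n++] = target;
  }
  return n;
}

/* Proposed ordering: the table lookup for the next iteration is issued
   one iteration early, so its result is ready when it is needed. */
static int sample_prefetched(long long word, int negative,
                             unsigned long long seed, long long *out) {
  unsigned long long next_random = seed;
  long long target, next_target = 0;
  int n = 0;
  for (int d = 0; d < negative + 1; d++) {
    if (d == 0) {
      target = word;
      next_random = lcg(next_random);
      next_target = table[(next_random >> 16) % TABLE_SIZE];
    } else {
      target = next_target;
      /* next_random still holds the value that produced this target. */
      if (target == 0) target = next_random % (VOCAB_SIZE - 1) + 1;
      next_random = lcg(next_random);
      next_target = table[(next_random >> 16) % TABLE_SIZE];
      if (target == word) continue;
    }
    out[n++] = target;
  }
  return n;
}

/* Returns 1 if both orderings produce identical targets for one pass. */
static int same_draws(long long word, int negative, unsigned long long seed) {
  long long a[MAX_DRAWS], b[MAX_DRAWS];
  if (negative < 0 || negative + 1 > MAX_DRAWS) return 0;
  int na = sample_baseline(word, negative, seed, a);
  int nb = sample_prefetched(word, negative, seed, b);
  if (na != nb) return 0;
  for (int i = 0; i < na; i++)
    if (a[i] != b[i]) return 0;
  return 1;
}
```

Note that the prefetched variant advances the generator one extra time at the end of each pass, so if the loop in word2vec.c has the shape assumed here, a full training run would not consume exactly the same random stream as the unmodified code, even though each individual pass matches for a given starting state.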

Original issue reported on code.google.com by [email protected] on 22 Jul 2015 at 6:40
