Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use L2Expanded/L2SqrtExpanded instead of the *Unexpanded variants #6303

Open
wants to merge 3 commits into
base: branch-25.04
Choose a base branch
from

Conversation

jcrist
Copy link
Member

@jcrist jcrist commented Feb 7, 2025

The *Expanded variants were already used in most places, but not everywhere. Using the *Expanded versions should be more performant in all cases, and switching to be uniform across cuml gives more expected behavior.

@jcrist jcrist requested a review from a team as a code owner February 7, 2025 20:56
@jcrist jcrist requested review from dantegd and bdice February 7, 2025 20:56
@github-actions github-actions bot added the Cython / Python Cython or Python issue label Feb 7, 2025
@jcrist jcrist added bug Something isn't working non-breaking Non-breaking change labels Feb 7, 2025
@dantegd
Copy link
Member

dantegd commented Feb 8, 2025

It seems that the small numeric changes are making one or a few tests get out of their error margin. We should run some of these tests a number of times (at least 20... more if possible) and see in what margin of error they fall and if it's still an acceptable margin (probably)

Previously we were using `L2Unexpanded`/`L2SqrtUnexpanded`, which are
more costly.
Previously we were using `L2SqrtUnexpanded`, which is more costly.
@jcrist
Copy link
Member Author

jcrist commented Feb 14, 2025

There are 3 consistent failures, all in test_nearest_neighbors_rbc (link).

2 are tolerance issues - bumping atol from 1e-3 to 6e-3 fixes things.

The 3rd is for euclidean with dims=3, n_neighbors=25 - the number of mismatched indices exceeds the threshold (max of 3 is allowed, we now consistently get 37).

If no one is opposed, I think I'll revert the change in pairwise_distances, but keep it in umap/dbscan, and open a follow-up issue to investigate.

@codecov-commenter
Copy link

Codecov Report

All modified and coverable lines are covered by tests ✅

Please upload report for BASE (branch-25.04@a2a8080). Learn more about missing BASE report.

Additional details and impacted files
@@               Coverage Diff               @@
##             branch-25.04    #6303   +/-   ##
===============================================
  Coverage                ?   67.07%           
===============================================
  Files                   ?      202           
  Lines                   ?    13076           
  Branches                ?        0           
===============================================
  Hits                    ?     8771           
  Misses                  ?     4305           
  Partials                ?        0           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jcrist
Copy link
Member Author

jcrist commented Feb 15, 2025

This is all green now, should be good-to-go.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Cython / Python Cython or Python issue non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants