Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Java: Fixing the broken pre-filtering for Brute Force #706

Open
wants to merge 5 commits into
base: branch-25.04
Choose a base branch
from

Conversation

chatman
Copy link
Contributor

@chatman chatman commented Feb 18, 2025

Changed the signature from long[] prefilter to BitSet[] prefilters.
Added a very simple test.

Copy link

copy-pr-bot bot commented Feb 18, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@chatman chatman changed the base branch from branch-25.04 to branch-25.02 February 18, 2025 00:31
@chatman
Copy link
Contributor Author

chatman commented Feb 18, 2025

FYI @ChrisHegarty @narangvivek10.

@ChrisHegarty
Copy link
Contributor

relates #697

int64_t prefilter_shape[1] = {(n_queries * n_rows + 31) / 32};
DLManagedTensor prefilter_tensor = prepare_tensor(prefilter_data_d, prefilter_shape, kDLUInt, 32, 1, kDLCUDA);
// Parse the filters data
uint32_t *prefilters = (uint32_t*) malloc(prefilter_data_length);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why it is necessary to malloc and copy the pre filter data. Is it not possible to access it directly from the passed pointer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to change the byte ordering to make it work. Just type casting it to uint32_t * didn't work. I'll explore if doing some in-place transformation without allocating a new array is feasible.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be possible to write the expected byte order at the java level,

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the C++ side I added last month an option to specify the byte order used at the time of the creation of the prefilter data: https://github.com/rapidsai/raft/blob/branch-25.04/cpp/include/raft/core/bitmap.hpp#L64.
I think in your case it was created with uint8_t? So specifying original_nbits = sizeof(uint8_t) should prevent you from that extra allocation/copy. It is not present yet in the C API though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ChrisHegarty fixed, it was unnecessary to change the order. I was doing things wrong. Fixed now.

@cjnolet cjnolet added bug Something isn't working non-breaking Introduces a non-breaking change labels Feb 18, 2025
@rhdong
Copy link
Member

rhdong commented Feb 18, 2025

/ok to test

@rhdong
Copy link
Member

rhdong commented Feb 19, 2025

Hi @chatman , the style checking seems to fail, may you have time to take a look? thanks!

@chatman
Copy link
Contributor Author

chatman commented Feb 19, 2025

FYI, for larger dataset sizes, this fix is incorrect. Working on the proper fix.

@rhdong
Copy link
Member

rhdong commented Feb 19, 2025

/ok to test

1 similar comment
@rhdong
Copy link
Member

rhdong commented Feb 19, 2025

/ok to test

@chatman chatman changed the base branch from branch-25.02 to branch-25.04 February 19, 2025 17:45
@chatman chatman force-pushed the ishan/prefiltering-fix branch from a53dbf2 to b6a7661 Compare February 19, 2025 17:45
@chatman
Copy link
Contributor Author

chatman commented Feb 19, 2025

I think this is ready for merge. Please review @rhdong @ChrisHegarty @narangvivek10.

@rhdong
Copy link
Member

rhdong commented Feb 19, 2025

/ok to test

@chatman
Copy link
Contributor Author

chatman commented Feb 20, 2025

@rhdong Upon testing, I'm finding that on some systems, this is not working. On such systems, search call is returning non success return value, with invalid memory access. I copied the prefilter data into a device array and passed that array, and it works again. Wondering why it is working on one system with just the host array, but not on others. I'll continue the investigation and fix this by tomorrow (hopefully, before the freeze for the next release). FYI @cjnolet.

@rhdong
Copy link
Member

rhdong commented Feb 21, 2025

@rhdong Upon testing, I'm finding that on some systems, this is not working. On such systems, search call is returning non success return value, with invalid memory access. I copied the prefilter data into a device array and passed that array, and it works again. Wondering why it is working on one system with just the host array, but not on others. I'll continue the investigation and fix this by tomorrow (hopefully, before the freeze for the next release). FYI @cjnolet.

Hi @chatman , the API always uses the filter memory pointer as a device accessible one, a little weird that malloc can work. 🤔 May I have your reproduction code? or code snippet? Or the test configuration like dataset size, query size, etc. Many thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working non-breaking Introduces a non-breaking change
Projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

5 participants