Description
Currently (using cuML as an example), conda test environment initialization for most CI jobs involves first creating the test environment:
rapids-dependency-file-generator \
  --output conda \
  --file_key test_python \
  --matrix "cuda=${RAPIDS_CUDA_VERSION%.*};arch=$(arch);py=${RAPIDS_PY_VERSION}" | tee env.yaml
rapids-mamba-retry env create --force -f env.yaml -n test
And then downloading and installing build artifacts from previous jobs on top of this environment:
CPP_CHANNEL=$(rapids-download-conda-from-s3 cpp)
PYTHON_CHANNEL=$(rapids-download-conda-from-s3 python)
...
rapids-mamba-retry install \
  --channel "${CPP_CHANNEL}" \
  --channel "${PYTHON_CHANNEL}" \
  libcuml cuml
Besides forcing us to eat the cost of a second conda environment solve, this can in many cases cause drastic changes to the environment that block the job outright. For example, consider this cuML run, which fails because conda is unable to solve a downgrade from Arrow 15.0.0 (build 5) to 14.0.1.
Our current workaround is to manually add pinnings to the initially solved testing dependencies so that the artifact installation can be solved, but this adds a lot of burden, since we need to:
- identify what packages/changes are blocking artifact installation
- open PR(s) modifying the impacted repos
- follow up on each impacted repo to potentially remove the pinning later on
Would it be possible to consolidate some (or all) of these conda environment solves by instead:
- downloading the conda artifacts before creating the environment
- updating dependencies.yaml (or, if this isn't possible, the generated conda environment file) to include the desired packages, explicitly specifying the source channel to ensure we're picking up the build artifacts
- creating the environment with this patched file
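The steps above could be sketched roughly as follows. This is only an illustration, not an existing RAPIDS tool: patch_env_yaml is a hypothetical helper that patches the generated environment file, and it assumes the file has top-level `channels:` and `dependencies:` keys whose list items are not indented. Instead of `channel::package` specs, it relies on the artifact channels being listed first so they take priority.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of any RAPIDS tooling): print a copy of a
# conda environment file with the artifact channels listed first and the
# artifact packages added to the dependency list.
# Assumption: list items under `channels:` and `dependencies:` start at
# column 0 ("- item"), so the inserted entries use the same layout.
# usage: patch_env_yaml ENV_FILE CPP_CHANNEL PYTHON_CHANNEL PKG...
patch_env_yaml() {
  local env_file=$1 cpp_channel=$2 python_channel=$3
  shift 3
  awk -v cpp="$cpp_channel" -v py="$python_channel" -v pkgs="$*" '
    { print }
    # Artifact channels go directly under `channels:` so they rank highest.
    /^channels:[ \t]*$/ {
      print "- " cpp
      print "- " py
    }
    # Artifact packages go directly under `dependencies:`.
    /^dependencies:[ \t]*$/ {
      n = split(pkgs, p, " ")
      for (i = 1; i <= n; i++) print "- " p[i]
    }
  ' "$env_file"
}

# Intended use in a CI script (commented out; needs the RAPIDS CI tools):
#   CPP_CHANNEL=$(rapids-download-conda-from-s3 cpp)
#   PYTHON_CHANNEL=$(rapids-download-conda-from-s3 python)
#   rapids-dependency-file-generator ... > env.yaml
#   patch_env_yaml env.yaml "${CPP_CHANNEL}" "${PYTHON_CHANNEL}" libcuml cuml > env-patched.yaml
#   rapids-mamba-retry env create --force -f env-patched.yaml -n test
```

With this ordering, the artifacts are downloaded before any environment exists and only a single solve is performed. Whether channel order alone is enough to guarantee the build artifacts are picked up would still need to be confirmed.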
In my mind, the main blocker to this working would be if rapids-download-conda-from-s3 requires some conda packages contained in the testing environment to work.
EDIT: Updating to capture state of other projects:
- rmm (Create Conda CI test env in one step rmm#1824, Consolidate more Conda solves in CI rmm#1828)
- raft (Create Conda CI test env in one step raft#2580, Consolidate more Conda solves in CI raft#2587)
- cudf (Create Conda CI test env in one step cudf#17995, consolidate more conda solves in CI cudf#18014)
- kvikio (Add CUDA libs in Python Conda, Consolidate Conda CI installs & use rapids-dask-dependency kvikio#513, Consolidate more Conda solves in CI kvikio#636)
- cuml (Install test dependencies at the same time as cuml packages. cuml#5781, Consolidate more Conda solves in CI cuml#6321)
- cucim (Create Conda CI test env in one step cucim#833, Consolidate more Conda solves in CI cucim#835)
- cugraph (Create Conda CI test env in one step cugraph#4935)
- cugraph-gnn (Create Conda CI test env in one step cugraph-gnn#144)
- cumlprims_mg (no changes needed!)
- cuspatial (create conda ci test env in one step cuspatial#1387)
- cuvs (Create Conda CI test env in one step cuvs#684, Consolidate more Conda solves in CI cuvs#701)
- cuxfilter (Create Conda CI test env in one step cuxfilter#663, Consolidate more Conda solves in CI cuxfilter#664)
- dask-cuda (Create Conda CI test env in one step dask-cuda#1448, Consolidate more Conda solves in CI dask-cuda#1452)
- nx-cugraph (Create Conda CI test env in one step nx-cugraph#90)
- rapids-cmake (no changes needed!)
- rapids-dask-dependency (no changes needed!)
- ucx-py (Create Conda test environment in one go ucx-py#1101)
- ucxx (Create Conda CI test env in one step ucxx#373, Consolidate more Conda solves in CI ucxx#375)
- wholegraph (no changes needed! superseded by cugraph-gnn)