Implemented build.build-dir config option #15104

Open
wants to merge 5 commits into
base: master

ranger-ross
Contributor

@ranger-ross ranger-ross commented Jan 26, 2025

What does this PR try to resolve?

This PR adds a new build.build-dir configuration option that was proposed in #14125 (comment)

This new config option allows the user to specify a directory where intermediate build artifacts should be stored.
I have shortened it to just build-dir from target-build-dir, although naming is still subject to change.

What is a final artifact vs an intermediate build artifact

Final artifacts

These are the files that end users will typically want to access directly or indirectly via a third party tool.

Intermediate build artifact, caches, and state

These are files that are used internally by Cargo/Rustc during the build process

  • other depinfo files (generated by rustc, fingerprint, etc. See https://github.com/rust-lang/cargo/blob/master/src/cargo/core/compiler/fingerprint/mod.rs#L164)
  • rlibs and debug info from dependencies
  • build script OUT_DIR
  • output from proc macros (previously stored in target/build)
  • incremental build output from rustc
  • fingerprint files used by Cargo for rebuild detection
  • scratchpad used for cargo package verify step
  • Cache of rustc invocations (.rustc_info.json)
  • "pre and non uplifted" binary executables. (ie. bins for examples that contain the hash in the name, bins for benches, proc macros, build scripts)
  • CARGO_TARGET_TMPDIR files (see rationale for this here)
  • future-incompat-report's .future-incompat-report.json file

Feature Gating Strategy

We are following the "Ignore the feature that is used without a gate" approach as described here.

The rationale for this is:
The build.build-dir option is likely going to be set by users "globally" (i.e. in $CARGO_HOME/config.toml) to configure a shared build directory and reduce rebuilding of dependencies. For users who work with multiple Cargo versions, having an error would be disruptive.
The fallback behavior is to revert to the behavior of the current stable release (building in $CARGO_TARGET_DIR).
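
The gating fallback described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Cargo's implementation: `resolve_build_dir` and its parameters are hypothetical names chosen for this example.

```rust
use std::path::PathBuf;

/// Hypothetical helper (not part of Cargo's API) that picks the directory
/// used for intermediate build artifacts.
///
/// * `unstable_build_dir`: whether the unstable gate is enabled.
/// * `configured`: the value of `build.build-dir`, if the user set one.
/// * `target_dir`: the already-resolved target directory.
fn resolve_build_dir(
    unstable_build_dir: bool,
    configured: Option<PathBuf>,
    target_dir: PathBuf,
) -> PathBuf {
    if !unstable_build_dir {
        // Gate is off: ignore `build.build-dir` without raising an error,
        // so a globally shared config keeps working on older toolchains.
        return target_dir;
    }
    // Gate is on: use the configured directory, or fall back to the
    // target directory when nothing was configured.
    configured.unwrap_or(target_dir)
}
```

With the gate off, a configured value is ignored and the target directory is returned, which matches the "ignore the feature that is used without a gate" approach.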

Testing Strategy

  • We have the existing Cargo testsuite to be sure we do not introduce regressions.
    • I have also run the testsuite locally with the CLI flag removed to verify that all tests pass with the default build dir (which falls back to the target dir)
  • For testing thus far, I have been using a small hello world project with a few dependencies like rand to verify files are being output to the correct directory.
  • When this PR is closer to merging, I plan to test with some larger projects with more dependencies, build scripts, etc.
  • Other testing recommendations are welcome 🙇

How should we test and review this PR?

This is probably best reviewed commit by commit. I documented each commit.
I tried to follow the atomic commits recommendation in the Cargo contributors guide, but I split out some commits for ease of review. (Otherwise I think this would have ended up being 1 or 2 large commits 😅)

Questions

  • What is the expected behavior of cargo clean?
  • When using cargo package, I was expecting just the .crate file to be in target while all other output would be stored in build.build-dir? Not sure if we consider things like Cargo.toml, Cargo.toml.orig, .cargo_vcs_info.json part of the user-facing interface.
    • Current consensus is that only .crate is considered a final artifact
  • Where should cargo doc output go? HTML/JS for many crates can be pretty large. Moving it to the build-dir would help reduce duplication if we find that acceptable. For cargo doc --open this is not a problem, but it may be problematic for other use cases?
  • Are bins generated from benches considered final artifacts?
    • Since bins from examples are considered final artifacts, it seems natural that benches should also be considered final artifacts. However, unlike examples, the bench bins are stored in target/{profile}/deps instead of a dedicated directory (like target/{profile}/examples). We could move them into a dedicated directory (target/{profile}/benches), but that would also mean changing the structure of the target directory, which feels out of scope for this change. If we decide that benches are final artifacts, it would probably be better to consider that change as part of --artifact-dir (née --out-dir) Tracking Issue #6790
    • Answer: Implemented build.build-dir config option #15104 (comment)
  • Do we want to include a CARGO_BUILD_DIR shortcut env var?
    • The current commit (2af0c91) has included the CARGO_BUILD_DIR shortcut. This can be removed before merging if there is a good reason to.

TODO

  • Implementation
    • Add support in cargo clean
    • Implement templating for build.build-dir
    • Fix issue with target/examples still containing "pre-uplifted" binaries
    • Verify build-dir with non-bin crate types
  • Prepare for review
    • Clean up/improve docs
    • Review tests and add more as needed
    • Fix tests in CI (Windows is currently failing)
    • Clean up commits
    • Resolve remaining questions
  • Request review

@rustbot
Collaborator

rustbot commented Jan 26, 2025

r? @ehuss

rustbot has assigned @ehuss.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added A-build-execution Area: anything dealing with executing the compiler A-build-scripts Area: build.rs scripts A-configuration Area: cargo config files and env vars A-documenting-cargo-itself Area: Cargo's documentation A-filesystem Area: issues with filesystems A-future-incompat Area: future incompatible reporting A-layout Area: target output directory layout, naming, and organization A-rebuild-detection Area: rebuild detection and fingerprinting A-unstable Area: nightly unstable support A-workspaces Area: workspaces Command-package S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 26, 2025
@ranger-ross ranger-ross changed the title Added build-directory unstable feature flag Implemented build.build-dir config option Jan 26, 2025
Member

@weihanglo weihanglo left a comment

Just want to make sure I didn't miss something. From what I can tell these directories/files have been removed right?

  • target/<profile>/.metabuild
  • target/<profile>/.fingerprint
  • target/<profile>/deps
  • target/<profile>/incremental
  • target/<profile>/build
  • target/.cargo-lock
  • target/tmp
  • target/.rustc_info.json

@ranger-ross
Contributor Author

Just want to make sure I didn't miss something. From what I can tell these directories/files have been removed right?

  • target/<profile>/.metabuild
  • target/<profile>/.fingerprint
  • target/<profile>/deps
  • target/<profile>/incremental
  • target/<profile>/build
  • target/.cargo-lock
  • target/tmp
  • target/.rustc_info.json

Yes, with the exception of target/.cargo-lock.
I think we will still want this Cargo lock for backwards compatibility with previous versions of Cargo.

Ideally, in the longer term, it can be removed in favor of fine-grained locking like #4282

So a typical target directory will be something like

target
├── CACHEDIR.TAG
└── debug
    ├── .cargo-lock
    ├── examples
    └── hello_world // (the binary)
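
The split discussed in this exchange can be sketched as a small classifier over paths relative to the old `target` directory. This is a hedged illustration only: `Location` and `expected_location` are hypothetical names, the rules are taken solely from the list and reply above, and anything not listed (such as the lock file, `CACHEDIR.TAG`, `examples`, and the uplifted binary) is assumed to stay in the target directory.

```rust
/// Where a file from the old single-directory layout is expected to live.
#[derive(Debug, PartialEq, Eq)]
enum Location {
    TargetDir,
    BuildDir,
}

/// Hypothetical classifier (not Cargo code). `rel` is a `/`-separated path
/// relative to the old `target` directory, e.g. "debug/deps/libfoo.rlib".
fn expected_location(rel: &str) -> Location {
    let parts: Vec<&str> = rel.split('/').collect();
    match parts.as_slice() {
        // Top-level scratch space and the rustc invocation cache.
        ["tmp", ..] | [".rustc_info.json"] => Location::BuildDir,
        // Per-profile internal directories named in the review comment.
        [_profile, dir, ..]
            if matches!(
                *dir,
                ".metabuild" | ".fingerprint" | "deps" | "incremental" | "build"
            ) =>
        {
            Location::BuildDir
        }
        // Everything else is assumed to remain a target-dir artifact.
        _ => Location::TargetDir,
    }
}
```

For example, `expected_location("debug/incremental/foo")` yields `Location::BuildDir`, while `expected_location("debug/hello_world")` yields `Location::TargetDir`. A profile literally named `tmp` would be misclassified by this sketch.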

@epage
Contributor

epage commented Jan 27, 2025

What is the expected behavior of cargo clean?

It should clean the build dir

@epage
Contributor

epage commented Jan 27, 2025

When using cargo package, I was expecting just the .crate file to be in target while all other output would be stored in build.build-dir? Not sure if we consider things like Cargo.toml, Cargo.toml.orig, .cargo_vcs_info.json part of the user-facing interface.

imo The artifact for cargo package is the .crate. Everything else is part of the "build" process.

@epage
Contributor

epage commented Jan 27, 2025

Can we call out explicitly what our testing strategy is?

We likely should also explicitly document in the PR what is considered an artifact and what is a build output and make sure we have tests for these.

@ranger-ross
Contributor Author

One other question that came to my mind was the output of cargo doc. HTML/JS for many crates can be pretty large. Moving it to the build-dir would help reduce duplication if we find that acceptable. For cargo doc --open this is not a problem, but it may be problematic for other use cases?

Perhaps symlinking the index.html from the build dir into target could be an option if we care about keeping an entry point.

@ranger-ross
Contributor Author

Can we call out explicitly what our testing strategy is?

We likely should also explicitly document in the PR what is considered an artifact and what is a build output and make sure we have tests for these.

@epage sure, I updated the PR description but let me know if I missed anything.

@epage
Contributor

epage commented Jan 28, 2025

One other question that came to my mind was the output of cargo doc. HTML/JS for many crates can be pretty large. Moving it to the build-dir would help reduce duplication if we find that acceptable. For cargo doc --open this is not a problem, but it may be problematic for other use cases?

imo cargo doc's output is an artifact that people will want access to. I suspect it'd be a breaking change to move it out of target-dir.

@epage
Contributor

epage commented Jan 28, 2025

depinfo files (.d files)

There are multiple types of depinfo files. I suspect the ones next to final artifacts are also considered final artifacts, see https://doc.rust-lang.org/cargo/reference/build-cache.html#dep-info-files

@epage
Contributor

epage commented Jan 28, 2025

FYI I added to the PR description a couple more intermediate artifacts

  • rlibs and debug info from dependencies
  • build script OUT_DIR

When are workspace member rlibs considered final artifacts? We're putting them in target/<profile> at times, so I take it that has already been answered.

@ranger-ross ranger-ross force-pushed the target-build-dir branch 3 times, most recently from 2af0c91 to 1a326c2 Compare February 9, 2025 07:47
@ranger-ross
Contributor Author

ranger-ross commented Feb 9, 2025

I think this PR is now complete to the point that I will mark this PR as ready to review.

Since the last review I have:

  • Added a shortcut for CARGO_BUILD_DIR
  • Updated the docs in unstable.md
  • Cleaned up/fixed the tests and added a few more different crate types
  • Updated the PR description with a feature gating strategy

r? @epage

@rustbot rustbot assigned epage and unassigned ehuss Feb 9, 2025
@ranger-ross ranger-ross marked this pull request as ready for review February 9, 2025 08:35
Contributor

From #15104 (comment)

Can we call out explicitly what our testing strategy is?

We likely should also explicitly document in the PR what is considered an artifact and what is a build output and make sure we have tests for these.

Contributor

From #15104 (comment)

@epage sure, I updated the PR description but let me know if I missed anything.

Contributor

From #15104 (comment)

Sorry if I wasn't clear but my discussion of testing strategy was in the context of tracking the categorization of artifacts. I'm not saying we have to exhaustively test it but we should think about it and document the choice made and why.

Contributor Author

Sorry, I think I misunderstood your request. (perhaps multiple times)

I added a doc comment at the top of build_dir.rs with a summary of the test strategy for that file and the rationale behind it.

Let me know if that is what you were asking for.

Contributor

In the ideal scenario, we verify

  • each file listed under artifact-dir vs build-dir is in the right place
  • ensure new files get categorized correctly

I'm assuming we can't meet that ideal. I would like to understand why we can't meet that ideal and understand how close we should get to it along with a call out of what gaps are left.
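
As a rough illustration of the first check in that ideal scenario, a test could snapshot every file under a directory and compare the result against an expected list. The helper below is a standard-library-only sketch with a hypothetical name, `collect_files`; it is not taken from Cargo's test support code. One likely gap it makes visible is that intermediate file names containing hashes cannot be listed exactly ahead of time.

```rust
use std::collections::BTreeSet;
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical test helper: returns every file under `root` as a sorted
/// set of `/`-separated paths relative to `root`.
fn collect_files(root: &Path) -> io::Result<BTreeSet<String>> {
    // Recursive walk that records files and descends into directories.
    fn walk(root: &Path, dir: &Path, out: &mut BTreeSet<String>) -> io::Result<()> {
        for entry in fs::read_dir(dir)? {
            let path = entry?.path();
            if path.is_dir() {
                walk(root, &path, out)?;
            } else {
                let rel = path.strip_prefix(root).expect("path is under root");
                // Join components with '/' so results match across platforms.
                let rel = rel
                    .components()
                    .map(|c| c.as_os_str().to_string_lossy().into_owned())
                    .collect::<Vec<_>>()
                    .join("/");
                out.insert(rel);
            }
        }
        Ok(())
    }

    let mut out = BTreeSet::new();
    walk(root, root, &mut out)?;
    Ok(out)
}
```

A test could call `collect_files` once on the target directory and once on the build directory after a build, then assert that each set contains only the entries expected for that directory.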

These ares are in preparation to split target-dir into artifact-dir and build-dir
This is in preparation for splitting the intermediate build artifacts
from the `target` directory.
This commit adds a `build_dir` option to the `build` table in
`config.toml` and adds the equivalent field to `Workspace` and `GlobalContext`.
This commit implements the separation of the intermediate artifact
directory (called "build directory") from the target directory. (see rust-lang#14125)
@rustbot
Collaborator

rustbot commented Feb 11, 2025

☔ The latest upstream changes (possibly 321f14e) made this pull request unmergeable. Please resolve the merge conflicts.

if !self.cli_unstable().build_dir {
    return self.target_dir();
}
if let Some(dir) = self.get_env_os("CARGO_BUILD_DIR") {
Member

I understand CARGO_BUILD_DIR is proposed here because we already have CARGO_TARGET_DIR. However I feel like CARGO_TARGET_DIR was a mistake.

The build.build-dir config already comes with a CARGO_BUILD_BUILD_DIR environment variable for free. We may want to stick with it and stop stacking up extra environment variables.
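
The variable that comes "for free" follows Cargo's documented convention for config keys: prefix with `CARGO_`, uppercase the key, and replace dots and dashes with underscores. A minimal sketch of that mapping, using a hypothetical helper name that is not part of Cargo:

```rust
/// Hypothetical helper: derives the environment variable name that
/// corresponds to a dotted Cargo config key.
fn config_key_to_env_var(key: &str) -> String {
    let mut name = String::from("CARGO_");
    for ch in key.chars() {
        match ch {
            // Dots and dashes both become underscores.
            '.' | '-' => name.push('_'),
            // Everything else is uppercased.
            other => name.push(other.to_ascii_uppercase()),
        }
    }
    name
}
```

Applied to `build.build-dir` this gives `CARGO_BUILD_BUILD_DIR`, and applied to `build.target-dir` it gives `CARGO_BUILD_TARGET_DIR`, which is why `BUILD` appears twice in the new variable's name.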

Member

This is open for discussions btw.

Contributor Author

However I feel like CARGO_TARGET_DIR was a mistake.

Can I ask why you feel it was a mistake?

As a user of cargo I think it might be a bit confusing to see BUILD show up twice in the same variable name.
It makes sense when strictly following the Cargo configuration, but newcomers are often not aware of this.

I think there is value in providing a shortcut that is easily understandable.

Member

@weihanglo weihanglo Feb 12, 2025

Can I ask why you feel it was a mistake?

It's hindsight, honestly. Ideally, each environment variable would map to one configuration field, and vice versa. We already have cargo --config available if people want to pass complex values.

CARGO_TARGET_DIR is a standalone environment variable which Cargo and other community tools need to handle specially, and they need to understand the precedence of these options. If there were only one CARGO_BUILD_TARGET_DIR, then people wouldn't need to deal with two environment variables. And the --target-dir flag can just be an alias equivalent to --config 'build.target-dir=foo', and could eliminate the special handling.

It makes sense when strictly following the Cargo configuration, but newcomers are often not aware of this.

Could you elaborate a bit more on this? In the book there is a list of variables for each configuration field, so I assume there is no discovery issue here. (We could expand the doc of CARGO_BUILD_TARGET_DIR if the other one is removed, yet we won't remove it for sure.)

it might be a bit confusing to see BUILD show up twice in the same variable name.

I feel like it is a naming issue, regardless of whether we have the CARGO_BUILD_DIR variable or not.

Contributor

@epage epage Feb 12, 2025

And the --target-dir flag can just be an alias equivalent to --config 'build.target-dir=foo', and could eliminate the special handling.

The plan was to not offer a --build-dir CLI option and build-dir would eventually be decoupled from --target-dir

Member

@weihanglo weihanglo Feb 12, 2025

I think there is value in providing a shortcut that is easily understandable.

Not sure about “shortcut”. If you meant that it is 6 characters shorter, then yes, it is, though usability depends on whether people type it every day in interactive sessions. I would assume the majority of users set it once in a shell startup script or in ~/.cargo/config.toml. Then the length doesn't really matter.

Contributor

This is unstable, so we don't have to have a final decision. The question is mostly if there is enough reason for or against it for the initial implementation.

While I lean towards CARGO_BUILD_DIR, starting with fewer features makes it easier to identify when a feature is needed; when you offer everything, it's hard to tell what is used and what could be removed.

Contributor Author

Could you elaborate a bit more on this? In the book there is a list of variables for each configuration field, so I assume there is no discovery issue here. (We could expand the doc of CARGO_BUILD_TARGET_DIR if the other one is removed, yet we won't remove it for sure.)

I think that anyone remotely familiar with Rust can fairly quickly understand what CARGO_TARGET_DIR does without going to the documentation page. CARGO_BUILD_BUILD_DIR is much less obvious and will probably require some searching to figure out what it does.

But perhaps to your later point, if people just set it once in their shared cargo config, perhaps it does not matter that much.

While I lean towards CARGO_BUILD_DIR, starting with fewer features makes it easier to identify when a feature is needed; when you offer everything, it's hard to tell what is used and what could be removed.

I think this is fair. I can remove it from this PR and we can revisit if it's really needed later

let dest = root.join(dest);
// If the root directory doesn't already exist go ahead and create it
// here. Use this opportunity to exclude it from backups as well if the
// system supports it since this is a freshly created folder.
//
paths::create_dir_all_excluded_from_backups_atomic(root.as_path_unlocked())?;
if root != build_root {
    paths::create_dir_all_excluded_from_backups_atomic(build_root.as_path_unlocked())?;
Member

Not really on topic, but this reminds me of #11548 and #15061.

For the new build-dir, we don't have backward compatibility issues, and the files inside are really intermediate caches. We might want to reconsider the self-ignoring directory approach.

Labels
A-build-execution Area: anything dealing with executing the compiler A-build-scripts Area: build.rs scripts A-configuration Area: cargo config files and env vars A-documenting-cargo-itself Area: Cargo's documentation A-filesystem Area: issues with filesystems A-future-incompat Area: future incompatible reporting A-layout Area: target output directory layout, naming, and organization A-rebuild-detection Area: rebuild detection and fingerprinting A-unstable Area: nightly unstable support A-workspaces Area: workspaces Command-clean Command-package S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.