Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow UCM commands to have “unprocessed” arguments #5549

Open
wants to merge 7 commits into
base: trunk
Choose a base branch
from

Conversation

sellout
Copy link
Contributor

@sellout sellout commented Jan 17, 2025

Overview

There are some commands (like run) that don’t want expanded arguments passed to them. E.g., run main 1 should pass “1” to main, not the numbered result from a previous command.

Fixes #2805

Implementation notes

This makes a couple changes. First, it creates a Parameters structure that gives a more precise definition of the arguments a command can receive, and eliminates incorrect parameter definitions. It then adds an isStructured field to the ParameterType (née ArgumentType) record, which indicates whether arguments to parameters of that type should be parsed & have numbers expanded.

One of the complicated bits (as seen in foldParamsWithM) is that an “input argument” can be expanded to multiple arguments, and we have to process the arguments left to right, because we don’t know which parameter an argument applies to until after we’ve expanded all the ones ahead of it.

Test coverage

This adds a transcript for run that shows the new behavior (as best we can without capturing run output in transcripts).

Loose ends

This is an improved version of the first half of #5480, which also “fixed” #5193. The work here is also progress toward #5193, but the approach in #5480 isn’t great.

The new transcript here also explicitly use lib.install @unison/base, despite being in the transcripts-using-base directory, which I would think would make that unnecessary.

@aryairani
Copy link
Contributor

Great!

@aryairani
Copy link
Contributor

aryairani commented Jan 21, 2025

IIRC, it used to be that the transcript executable actually launched stack exec unison transcript _base.md <another.md>, which applies both transcripts to the same codebase. _base.md just loads and adds base.u.

transcript no longer launches subprocesses, but it looks like the multi-transcript functionality is preserved in the api

buildTests :: TestConfig -> TestBuilder -> FilePath -> Maybe FilePath -> Test ()
buildTests TestConfig {..} testBuilder inputDir outputDir = do
io . putStrLn . unlines $
[ "",
"Searching for transcripts to run in: " ++ inputDir
]
files <- io $ listDirectory inputDir
-- Any files that start with _ are treated as prelude
let (prelude, transcripts) =
files
& sort
& filter (\f -> let ext = takeExtensions f in ext == ".md" || ext == ".markdown")
& partition (isPrefixOf "_" . snd . splitFileName)
-- if there is a matchPrefix set, filter non-prelude files by that prefix - or return True
& second (filter (\f -> maybe True (`isPrefixOf` f) matchPrefix))
case length transcripts of
0 -> pure ()
-- EasyTest exits early with "no test results recorded"
-- if you don't give it any tests, this keeps it going
-- till the end so we can search all transcripts for
-- prefix matches.
_ -> tests (testBuilder runtimePath inputDir (fromMaybe inputDir outputDir) prelude <$> transcripts)

It looks like, for a given directory that we're going to run transcripts out of, we first look for any transcripts like _xxx.md, and run those first before each individual transcript, which should have the effect of incorporating base in this case (since _base.md is in transcripts-using-base/). So it ShOuLd be working automatically.

Apart from that, I noticed we do see this pattern in many transcripts now, e.g. idempotent/builtins.md

scratch/main> builtins.mergeio
scratch/main> load unison-src/transcripts-using-base/base.u
scratch/main> add

Will that work for fix-2805.md to avoid a clone?

@sellout
Copy link
Contributor Author

sellout commented Jan 21, 2025

IIRC, it used to be that the transcript executable actually launched stack exec unison transcript _base.md <another.md>, which applies both transcripts to the same codebase. _base.md just loads and adds base.u.
[...]
Will that work for fix-2805.md to avoid a clone?

Also, absent a trace-through on Transcripts.hs, it looks like transcripts saved within transcripts-using-base/ should be getting that automatically! 🤔

So, this is what caused my confusion last week – my previous usage of transcripts-using-base/ worked because base.u included the bits of @unison/base that I needed. But this time I am using things that don’t exist in base.u, and I hadn’t realized that transcripts-using-base/ referred to base.u instead of @unison/base (which is somewhat intentional, since it’s meant to be a subset … but it’s not obvious without working through Transcripts.hs that this is what’s happening), so it was baffling that some definitions from @unison/base worked, while others didn’t.

I don’t need to manually do the load/add (and probably things that explicitly do it in idempotent/ should move to transcripts-using-base/?), but I need to either add a few definitions to base.u or change my transcript to use stuff that already exists in base.u.

And separately, renaming transcripts-using-base/_base.md/base.u to “prelude” or something instead could help avoid this confusion. As well as adding something to transcripts-using-base/README.md about what’s happening.

@aryairani
Copy link
Contributor

So, this is what caused my confusion last week – my previous usage of transcripts-using-base/ worked because base.u included the bits of @unison/base that I needed. But this time I am using things that don’t exist in base.u, and I hadn’t realized that transcripts-using-base/ referred to base.u instead of @unison/base (which is somewhat intentional, since it’s meant to be a subset … but it’s not obvious without working through Transcripts.hs that this is what’s happening), so it was baffling that some definitions from @unison/base worked, while others didn’t.

Gotcha, makes sense.

I don’t need to manually do the load/add (and probably things that explicitly do it in idempotent/ should move to transcripts-using-base/?), but I need to either add a few definitions to base.u or change my transcript to use stuff that already exists in base.u.

I think adding a few definitions to base.u would be ok, or we could update it with the source of a newer version of base, if that works? But really the whole current system is a mess IMO. The "better" (though not good) solution I think would be to have a CI step that preps and caches a codebase with install @unison/base and then fork that codebase for each of the transcripts-using-base/ transcripts (which could or not also be idempotent).

We also have #5543 ready to merge, though I'm not clear on whether this helps in this scenario or not. I'm thinking "no" until we are trying to cache multiple distinct libraries on a transcript-by-transcript basis.

And separately, renaming transcripts-using-base/_base.md/base.u to “prelude” or something instead could help avoid this confusion. As well as adding something to transcripts-using-base/README.md about what’s happening.

Yeah I'm thinking more just that we wait on this step until we come up with a way for the whole thing to go away.

Maybe when Share Sync v2 is finished, we'll have a setting to cache certain library bundles, which would let them be reused efficiently in CI.


I guess for now I'd say let's just modify base.u to include whatever you need, or to be a newer version of base, to avoid a slow clone during CI.

A side note, even typechecking base.u is slow, though not as slow as cloning, and it would be nice if we could skip that step too or achieve it faster differently.

It shows that we can’t pass numbers to `run`.

(This also touches more of `all-base-hashes.output.md` than I would have
expected, but it does add some definitions to base.u.)
- distinguish between “parameters” and “arguments” – a command has a
  fixed number of parameters, each of which maps to some number of
  arguments (from 0 to many, depending on the parameter)
- change the type of parameters to eliminate invalid parameter structures
- require `InputPattern` parameters to cover all arguments (there are a
  number of commands whose parameters were under-specified).
This allows us to incrementally expand numbered arguments, only doing so when
we want a Unison value and not unstructured text.

Fixes unisonweb#2805.
Now that the arguments to `parse` are universally checked, the
individual parsers don’t need to handle the cases as precisely.
@sellout sellout force-pushed the optional-numbered-args branch from 0365e54 to 421ecc3 Compare February 5, 2025 03:09
@sellout
Copy link
Contributor Author

sellout commented Feb 5, 2025

Ok, I updated the new test to not install a library, and instead add a few definitions to base.u.

This also fixes a few new/updated commands in the merge to use the new Parameters structure.

Copy link
Contributor

@aryairani aryairani Feb 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The printLine output doesn't show up in the transcript output; I forget whether it's expected to? But in any case, is-is it doesn't show what we want. Maybe just return the value instead of printing it? Or does main have to return ()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the run output doesn’t get included in the transcript output yet. But yeah, I spaced on being able to return the Text. I’ll fix that. I comforted myself with the fact that the failures were errors, but it still didn’t feel right 😅

Copy link
Contributor

@aryairani aryairani Feb 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yup.

We should fix #1172 too at some point, but only if it's pretty easy :-\
I could see it getting hairy for more complex handle-manipulation, but maybe it wouldn't?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ucm run : Numeric arguments are changed to find results
2 participants