Add structure to UCM command args #5480

sellout · 2024-12-05T01:43:04Z

Overview

This changes the UCM command parser quite a bit.

The args now have to match the parameter definitions

First, this renames some things to make the distinction between parameters and arguments provided to those parameters clearer. This is most apparent in the InputPattern.args to InputPattern.params change.

Previously, there were a lot of commands which omitted or had incorrect parameter definitions. This prevented completion from working for those, and just allowed things to get a bit out of sync. This is no longer possible. They now have to match.

But an InputPattern’s Parameters are now used to determine whether an argument is “structured” or not. Structured args have numbered arg substitution applied, while unstructured args do not. So the arguments to run are unstructured, meaning numbers can be passed as arguments now.

Also, the arguments are lexed with Unison’s lexer, which (among other things) handles quoted strings consistently. This now allows for file paths containing spaces, and arguments to other functions (run, create.author, etc. now consistently allow quoted strings)

Fixes #2805 and #5193.

Implementation notes

It creates a more rigid Parameters structure is then folded alongside a list of arguments. As each argument is processed, it is optionally (based on the individual Parameter) passed through numeric expansion. That can cause an argument to be replaced with multiple arguments (because of ranges), which then cascade to the later parameters.

The result of the fold is a pair of remaining parameters and the expanded arguments that have been processed. If there are unsatisfied required parameters, then it its handed off to fuzzy completion.

Interesting/controversial decisions

This uses the existing Unison lexer. It seems like a better approach might be to use some of the lexing functions in a new (simpler) lexer that also supports lexemes for projects, branches, etc.

Test coverage

The transcripts check a lot of this.

Loose ends

This is still a WIP. It mostly works, but there are a couple remaining rough edges, and the code needs to be cleaned up.

Right now, I think it requires quoting strings a bit more often than would be ideal (basically, anything that doesn’t form a valid Unison lexeme, so "@project/branch", versions like "1.2.3", and a couple other cases). The error reporting needs to be improved as well.

I mentioned having a more specific lexer for this as one bit of further work (which would eliminate the extra quoting). Alternatively, we could lex incrementally – with each ParameterType specifying whether the next argument should be lexed or just taken to the next space.

Another good change would be to move the argument handling functions from the individual command parsers to the ParameterType structure.

- distinguish between “parameters” and “arguments” – a command has a fixed number of parameters, each of which maps to some number of arguments (from 0 to many, depending on the parameter) - change the type of parameters to eliminate invalid parameter structures - require `InputPattern` parameters to cover all arguments (there are a number of commands whose parameters were under-specified).

This allows us to incrementally expand numbered arguments, only doing so when we want a Unison value and not unstructured text. Fixes unisonweb#2805.

Good: - handling of quoted strings is now solid (e.g., paths can now contain spaces, `create.author` now takes a single string, rather than processing the tail of the input) Bad: - have to quote some things we rather wouldn’t – e.g., project/branch refs with @ or / - still some breakage to iron out

sellout · 2024-12-05T01:43:39Z

I would recommend only looking at the transcript diffs for now – the code diffs are overly complicated and messy.

aryairani · 2024-12-06T03:53:00Z

Yeah IMO the quoted strings are generally going to be a dealbreaker. I think they should be optional on filenames, and probably absent from branches, version number. view "1-2" also feels bad.

I want to think about it some more though.

sellout · 2024-12-10T01:10:15Z

Yeah IMO the quoted strings are generally going to be a dealbreaker. I think they should be optional on filenames, and probably absent from branches, version number. view "1-2" also feels bad.

I agree. I think the solution is either

make a UCM-specific lexer that shares some lexemes with Unison or
try the parse one token, if it fails, just breakOn " ". So that gives us quoting, etc., but we end up with a spaceless String if something goes wrong.

Option 2 is a bit more cowboy, and there are issues with ranges. 1-4 will successfully lex as [Numeric 1, Numeric (-4)], so won’t fall back to a String.

I think both options have an issue with the find : syntax, because the following type is a series of lexemes that we then need to parse to get an actual type. But I haven’t tried to fix that.

Also, the first three commits here should get their own PR – they fix the “pass numbers to run” issue and have some other improvements to UCM command parameter definitions. I can get those cleaned up without worrying about the lexer bit of this.

sellout added 4 commits December 2, 2024 20:51

Add structured arg checking to ParameterType

fa506d2

This allows us to incrementally expand numbered arguments, only doing so when we want a Unison value and not unstructured text. Fixes unisonweb#2805.

Some changes to structured arg checking

8622d62

sellout requested a review from aryairani December 5, 2024 01:43

sellout mentioned this pull request Jan 17, 2025

Allow UCM commands to have “unprocessed” arguments #5549

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add structure to UCM command args #5480

Add structure to UCM command args #5480

sellout commented Dec 5, 2024

sellout commented Dec 5, 2024

aryairani commented Dec 6, 2024

sellout commented Dec 10, 2024

Add structure to UCM command args #5480

Are you sure you want to change the base?

Add structure to UCM command args #5480

Conversation

sellout commented Dec 5, 2024

Overview

The args now have to match the parameter definitions

Implementation notes

Interesting/controversial decisions

Test coverage

Loose ends

sellout commented Dec 5, 2024

aryairani commented Dec 6, 2024

sellout commented Dec 10, 2024