Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Hexadecimal, Octal and Binary syntax #7695

Draft
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

shikanime
Copy link

@shikanime shikanime commented Jan 25, 2023

Motivation

Add support for integer literal syntax such as

  • 0bXXXX for binary;
  • 0xXXXX for hexadecimal;
  • 0oXXXX for octal;

Context

         🌸> ' ' フ
         | . _  _l     < I'm just a regular person on the internet who doesn't have a
        /` ミ_xノ         lot of expertise in the Nix source code, but I saw a post about
       / .     |            someone wanting this certain feature and thought I'd chime in! 🌸
      /   ヽ . ノ
     │ . |  |  |
 / ̄| .  |  |  |
 | ( ̄ヽ__ヽ_)__)
 \二つ

fixes #7578

I don't know where to write the tests for the parser/lexer, feel free to comment on this!

Checklist for maintainers

Maintainers: tick if completed or explain if not relevant

  • agreed on idea
  • agreed on implementation strategy
  • tests, as appropriate
    • functional tests - tests/**.sh
    • unit tests - src/*/tests
    • integration tests
  • documentation in the manual
  • code and comments are self-explanatory
  • commit message explains why the change was made
  • new feature or bug fix: updated release notes

@@ -113,6 +113,9 @@ static StringToken unescapeStr(SymbolTable & symbols, char * s, size_t length)
ANY .|\n
ID [a-zA-Z\_][a-zA-Z0-9\_\'\-]*
INT [0-9]+
BIN 0b[01]+
OCT 0o[0-7]+

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to change INT to [1-9][0-9]* and OCT to 0[0-7]*, so octal numbers start with 0 which is IMHO more common. On the other hand, this would break padding with leading zeros, but this looks better with spaces anyway IMHO. This would make 0 an octal 0, but it is still zero. The letter o is quite similar to 0, so 0o77 and 0077 look very similar and one might expect them to be both octal values, so at least add 0[0-7]+ as an alternative to 0o[0-7]+.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should break 010 == 10, as Nix is meant to be a very stable language. It does seem like a good idea to warn about zero-padded integers, as the existence of octal support makes them quite ambiguous.
Aesthetics are of secondary importance, and explicit is good.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should definitely deprecate 010 == 10, as numbers starting with 0 are usually read as octals for historic reasons. Perhaps a warning like »Deprecated syntax: If you intended to zero-pad a decimal value, consider to use spaces instead. If you meant to write an octal value, write 0o<value>. Values starting with 0 are ambiguous.«.

Another issue: Binary values tend to be quite long, so like in python, allowing _ to group e.g. bytes might be beneficial for readability, for decimals (e.g. multiple of 1000) or hexadecimals (e.g. words/dwords) and perhaps for floating point literals as well.

Co-authored-by: Robert Hensing <[email protected]>
tests/eval.sh Show resolved Hide resolved
@roberth
Copy link
Member

roberth commented Feb 21, 2023

I think we can potentially unblock this PR by running an experiment:

  • Add a deprecation warning to the syntax we'd like to steal ([ <number><identifier> ] - a list with at least these two consecutive items, without a separating space)
  • The warning refers to the issue, and asks users to report back if the change is a problem for them.

I think this should be discussed with the Nix team.
Question for the team: are we ok with running a deprecation experiment like this?

Click for possible future steps (not necessary to discuss yet; overly detailed)

Follow-up steps:

  1. Change this PR to put the syntax change behind a feature flag

    • If the feature flag is enabled, use the new syntax, as simple as that
    • If the feature flag is not enabled, use the old syntax, but print a deprecation warning such as:
      Future ambiguous syntax found in file.nix:123, 0x100 We plan to reinterpret this syntax as a hexadecimal number in an upcoming release. We've estimated that this character sequence was highly unlikely to occur in practice, so if you would like for this file to be reproducible in future versions of Nix, please report back to us at .
  2. Wait for reports. 1 year?

  3. If nothing is reported, just steal the syntax. If something is reported, see if we can work around the issue by checking whether such a variable is in scope. Otherwise, this is blocked by a language version declaration feature, or we'd have to reconsider our backward compatibility policy

@fricklerhandwerk fricklerhandwerk added language The Nix expression language; parser, interpreter, primops, evaluation, etc feature Feature request or proposal labels Feb 24, 2023
@fricklerhandwerk
Copy link
Contributor

fricklerhandwerk commented Feb 24, 2023

Triaged in the Nix team meeting:

  • @thufschmitt: technically this is a breaking change
    • incidentally @yorickvP tried to address a very similar problem and stopped because of that, then started to work on an RFC to deal with breaking changes
    • @Ericson2314: if we had proper language versioning it wouldn't be so much of an issue
    • there would be some use cases for this syntax, seen this multiple times
  • @edolstra: this is nice to have, but not sure if there is a huge demand for non-decimal notation
  • @Ericson2314: intermediate goals (figuring out the breaking change story, and gaining experience as a team) are probably more valuable than the feature itself
  • agreement; postponed to after we deal with the overarching issue of breaking changes

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2023-02-24-nix-team-meeting-minutes-35/25757/1

@roberth roberth added the breaking Changes we can't make without breaking old expressions, changing hashes, etc label Jun 1, 2023
@nyabinary
Copy link

Triaged in the Nix team meeting:

* @thufschmitt: technically this is a breaking change
  
  * incidentally @yorickvP tried to address a very similar problem and stopped because of that, then started to work on an RFC to deal with breaking changes
  * @Ericson2314: if we had proper language versioning it wouldn't be so much of an issue
  * there would be some use cases for this syntax, seen this multiple times

* @edolstra: this is nice to have, but not sure if there is a huge demand for non-decimal notation
  
  * @fricklerhandwerk: depends on the audience you ask

* @Ericson2314: intermediate goals (figuring out the breaking change story, and gaining experience as a team) are probably more valuable than the feature itself

* agreement; postponed to after we deal with the overarching issue of breaking changes

Any update about this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Changes we can't make without breaking old expressions, changing hashes, etc feature Feature request or proposal language The Nix expression language; parser, interpreter, primops, evaluation, etc
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support hexadecimal and octal syntax for integers
6 participants