Support Hexadecimal, Octal and Binary syntax #7695

Draft
wants to merge 9 commits into
base: master
Changes from 5 commits
30 changes: 30 additions & 0 deletions src/libexpr/lexer.l
@@ -113,6 +113,9 @@ static StringToken unescapeStr(SymbolTable & symbols, char * s, size_t length)
ANY .|\n
ID [a-zA-Z\_][a-zA-Z0-9\_\'\-]*
INT [0-9]+
BIN 0b[01]+
OCT 0o[0-7]+

I suggest changing INT to [1-9][0-9]* and OCT to 0[0-7]*, so that octal numbers start with 0, which is IMHO more common. This would break padding with leading zeros, but padding looks better with spaces anyway IMHO. It would also make 0 an octal 0, but that is still zero. The letter o looks a lot like 0, so 0o77 and 0077 look very similar and one might expect both to be octal values; at the least, add 0[0-7]+ as an alternative to 0o[0-7]+.
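
To make the ambiguity concrete, here is a minimal sketch of how the same zero-prefixed token reads under the two interpretations. `parseInBase` is a hypothetical helper written for this illustration, not code from this PR; it only wraps the standard `strtoll` call that the lexer rules already use.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <optional>
#include <string>

// Hypothetical helper (not part of the PR): parse `text` in the given base.
// Returns std::nullopt on overflow or if any character is not a valid digit
// for that base.
std::optional<long long> parseInBase(const std::string & text, int base)
{
    assert(base >= 2 && base <= 36);  // range accepted by strtoll
    if (text.empty()) return std::nullopt;
    errno = 0;
    char * end = nullptr;
    long long n = std::strtoll(text.c_str(), &end, base);
    // errno != 0 signals overflow; *end != '\0' means trailing garbage.
    if (errno != 0 || *end != '\0') return std::nullopt;
    return n;
}
```

With this helper, `parseInBase("010", 10)` yields 10 (the current Nix reading) while `parseInBase("010", 8)` yields 8 (the C-style octal reading), and `parseInBase("09", 8)` yields no value because 9 is not an octal digit.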

I don't think we should break 010 == 10, as Nix is meant to be a very stable language. It does seem like a good idea to warn about zero-padded integers, as the existence of octal support makes them quite ambiguous.
Aesthetics are of secondary importance, and explicit is good.
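
A warning for zero-padded integers could hang off a check like the one below. This is only a sketch: `isZeroPaddedDecimal` is a hypothetical name, and where the warning would be emitted (for instance in the `{INT}` lexer action) is an assumption, not something this PR implements.

```cpp
#include <string_view>

// Hypothetical predicate (not part of the PR): true for a decimal token with
// a redundant leading zero, such as "010" or "007". A lone "0" is not padded,
// and prefixed literals such as "0x10" are rejected because they contain a
// non-digit character.
constexpr bool isZeroPaddedDecimal(std::string_view token)
{
    if (token.size() < 2 || token.front() != '0') return false;
    for (char c : token)
        if (c < '0' || c > '9') return false;
    return true;
}

// Compile-time checks of the intended behaviour.
static_assert(isZeroPaddedDecimal("010"));
static_assert(!isZeroPaddedDecimal("0"));
static_assert(!isZeroPaddedDecimal("10"));
```

Keeping 010 == 10 and only warning means existing expressions evaluate exactly as before; the predicate just identifies the literals a reader could mistake for octal.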

We should definitely deprecate 010 == 10, as numbers starting with 0 are usually read as octal for historical reasons. Perhaps a warning like »Deprecated syntax: if you intended to zero-pad a decimal value, consider using spaces instead. If you meant to write an octal value, write 0o<value>. Values starting with 0 are ambiguous.«

Another issue: binary values tend to be quite long, so, as in Python, allowing _ as a digit separator to group e.g. bytes would help readability. The same applies to decimals (e.g. groups of three digits), hexadecimals (e.g. words/dwords), and perhaps floating-point literals as well.
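
One possible way to support such separators, sketched under assumptions: the lexer patterns would need to accept `_` between digits, and the action would strip the separators before calling `strtoll`. Both function names below are hypothetical and nothing like this exists in the PR; the sketch also does not validate where the underscores appear, which the lexer pattern would have to enforce.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: drop '_' separators from the digit part of a literal
// (the part after any 0b/0o/0x prefix).
std::string stripDigitSeparators(std::string_view digits)
{
    std::string out;
    out.reserve(digits.size());
    for (char c : digits)
        if (c != '_') out.push_back(c);
    return out;
}

// Hypothetical helper: parse grouped digits in the given base, returning
// std::nullopt on overflow, invalid digits, or an empty digit string.
std::optional<long long> parseGrouped(std::string_view digits, int base)
{
    assert(base >= 2 && base <= 36);  // range accepted by strtoll
    std::string clean = stripDigitSeparators(digits);
    if (clean.empty()) return std::nullopt;
    errno = 0;
    char * end = nullptr;
    long long n = std::strtoll(clean.c_str(), &end, base);
    if (errno != 0 || *end != '\0') return std::nullopt;
    return n;
}
```

For example, `parseGrouped("0111_1011", 2)` gives 123, the same value as the `0b1111011` literal in tests/eval.nix, and `parseGrouped("1_000_000", 10)` gives 1000000.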

HEX 0x[0-9a-fA-F]+
FLOAT (([1-9][0-9]*\.[0-9]*)|(0?\.[0-9]+))([Ee][+-]?[0-9]+)?
PATH_CHAR [a-zA-Z0-9\.\_\-\+]
PATH {PATH_CHAR}*(\/{PATH_CHAR}+)+\/?
@@ -160,6 +163,33 @@ or { return OR_KW; }
}
return INT;
}
{BIN} { errno = 0;
yylval->n = strtoll(yytext + 2, 0, 2);
if (errno != 0)
throw ParseError({
.msg = hintfmt("invalid binary integer '%1%'", yytext),
.errPos = data->state.positions[CUR_POS],
});
return BIN;
}
{OCT} { errno = 0;
yylval->n = strtoll(yytext + 2, 0, 8);
if (errno != 0)
throw ParseError({
.msg = hintfmt("invalid octal integer '%1%'", yytext),
.errPos = data->state.positions[CUR_POS],
});
return OCT;
}
{HEX} { errno = 0;
yylval->n = strtoll(yytext + 2, 0, 16);
if (errno != 0)
throw ParseError({
.msg = hintfmt("invalid hexadecimal integer '%1%'", yytext),
.errPos = data->state.positions[CUR_POS],
});
return HEX;
}
{FLOAT} { errno = 0;
yylval->nf = strtod(yytext, 0);
if (errno != 0)
5 changes: 4 additions & 1 deletion src/libexpr/parser.y
@@ -332,7 +332,7 @@ void yyerror(YYLTYPE * loc, yyscan_t scanner, ParseData * data, const char * err
%type <id> attr
%token <id> ID ATTRPATH
%token <str> STR IND_STR
%token <n> INT
%token <n> INT BIN OCT HEX
%token <nf> FLOAT
%token <path> PATH HPATH SPATH PATH_END
%token <uri> URI
@@ -449,6 +449,9 @@ expr_simple
else
$$ = new ExprVar(CUR_POS, data->symbols.create($1));
}
| BIN { $$ = new ExprInt($1); }
| OCT { $$ = new ExprInt($1); }
| HEX { $$ = new ExprInt($1); }
| INT { $$ = new ExprInt($1); }
| FLOAT { $$ = new ExprFloat($1); }
| '"' string_parts '"' { $$ = $2; }
3 changes: 3 additions & 0 deletions tests/eval.nix
@@ -1,5 +1,8 @@
{
int = 123;
bin = 0b1111011;
hex = 0x7b;
oct = 0o173;
str = "foo";
attr.foo = "bar";
}
3 changes: 3 additions & 0 deletions tests/eval.sh
@@ -14,6 +14,9 @@ EOF
nix eval --expr 'assert 1 + 2 == 3; true'

[[ $(nix eval int -f "./eval.nix") == 123 ]]
[[ $(nix eval bin -f "./eval.nix") == 123 ]]
[[ $(nix eval hex -f "./eval.nix") == 123 ]]
[[ $(nix eval oct -f "./eval.nix") == 123 ]]
[[ $(nix eval str -f "./eval.nix") == '"foo"' ]]
[[ $(nix eval str --raw -f "./eval.nix") == 'foo' ]]
[[ $(nix eval attr -f "./eval.nix") == '{ foo = "bar"; }' ]]