-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added parser for /proc/[pid]/environ
#28
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In terms of the API - what do you think about modeling it after VarOs
, an iterator of (OsString, OsString)
?
src/pid/environ.rs
Outdated
/// Parses the provided environ file. | ||
fn environ_path<P: AsRef<Path>>(path: P) -> Result<BTreeMap<Vec<u8>, Vec<u8>>> { | ||
let file_size = metadata(&path)?.len(); | ||
let mut buf = vec![0u8; file_size as usize]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is racy with updates to the env, right? I think it would be better and simpler just to use File::read_to_end
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From man 5 proc
:
If, after an execve(2), the process modifies its environment (e.g., by calling functions such as putenv(3) or modifying the environ(7) variable directly), this file will not reflect those changes.
So the only race condition I can think of is the process is closed in between getting the size and reading the file.. But still you are right, will fix.
I totally forgot about |
Updated the patch. Unfortunately |
Sorry for not being clear, I was suggesting creating a new type like |
actually in terms of naming I think either of these would be fine:
each has advantages, what do you think? |
Ah, np. Personally I would rather go with |
Environ sounds good. In terms of struct vs typedef I think the only option
will be struct since you can typedef a type containing a closure, but if
you have a trick in unaware of to make that possible then go for it (but no
boxing please).
…On Sat, Aug 26, 2017 at 5:52 PM Denis ***@***.***> wrote:
Ah, np. Personally I would rather go with procinfo::pid::Environ since it
sounds easier to guess what it is (IMHO), but I'd say the choice is yours.
Btw, regardless of the naming, do you want it to be a full-featured struct
or just a typedef? I don't see any advantages with a separate class but
there might be some I just can't think of.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#28 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAJx8miZMi9cA7KZawmTnH2J9KnOsnKTks5scL3FgaJpZM4PDooe>
.
|
Can't *
…On Sat, Aug 26, 2017 at 6:03 PM Dan Burkert ***@***.***> wrote:
Environ sounds good. In terms of struct vs typedef I think the only option
will be struct since you can typedef a type containing a closure, but if
you have a trick in unaware of to make that possible then go for it (but no
boxing please).
On Sat, Aug 26, 2017 at 5:52 PM Denis ***@***.***> wrote:
> Ah, np. Personally I would rather go with procinfo::pid::Environ since
> it sounds easier to guess what it is (IMHO), but I'd say the choice is
> yours.
> Btw, regardless of the naming, do you want it to be a full-featured
> struct or just a typedef? I don't see any advantages with a separate
> class but there might be some I just can't think of.
>
> —
> You are receiving this because you commented.
>
>
> Reply to this email directly, view it on GitHub
> <#28 (comment)>,
> or mute the thread
> <https://github.com/notifications/unsubscribe-auth/AAJx8miZMi9cA7KZawmTnH2J9KnOsnKTks5scL3FgaJpZM4PDooe>
> .
>
|
Well, I though about a simple |
On second thought that was gibberish. Up to you on typedef vs struct.
…On Sat, Aug 26, 2017 at 6:03 PM Dan Burkert ***@***.***> wrote:
Can't *
On Sat, Aug 26, 2017 at 6:03 PM Dan Burkert ***@***.***> wrote:
> Environ sounds good. In terms of struct vs typedef I think the only
> option will be struct since you can typedef a type containing a closure,
> but if you have a trick in unaware of to make that possible then go for it
> (but no boxing please).
>
> On Sat, Aug 26, 2017 at 5:52 PM Denis ***@***.***> wrote:
>
>> Ah, np. Personally I would rather go with procinfo::pid::Environ since
>> it sounds easier to guess what it is (IMHO), but I'd say the choice is
>> yours.
>> Btw, regardless of the naming, do you want it to be a full-featured
>> struct or just a typedef? I don't see any advantages with a separate
>> class but there might be some I just can't think of.
>>
>> —
>> You are receiving this because you commented.
>>
>>
>> Reply to this email directly, view it on GitHub
>> <#28 (comment)>,
>> or mute the thread
>> <https://github.com/notifications/unsubscribe-auth/AAJx8miZMi9cA7KZawmTnH2J9KnOsnKTks5scL3FgaJpZM4PDooe>
>> .
>>
>
|
Anything else? :) |
src/pid/environ.rs
Outdated
use libc::pid_t; | ||
|
||
/// A list of environment variables and their values. | ||
pub type Environ = Vec<(OsString, OsString)>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was hoping there was a way to avoid collecting into an intermediate Vec
, but I spent a while on it and didn't find an elegant way. Since the API's just exposing a vec, I think we should skip the type def and just return Vec<(OsString, OsString)>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm. I'll think a bit more on this.
src/pid/environ.rs
Outdated
// Parse 'key=value" pair. | ||
named!( | ||
pair<&[u8], (&[u8], &[u8])>, | ||
tuple!(take_until_and_consume!("="), take_until!("\0")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is going to have different behavior that the standard library when the key contains a leading '=' character (https://github.com/rust-lang/rust/blob/1.19.0/src/libstd/sys/unix/os.rs#L405-L408). Can this skip a character first to avoid this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, please add a test case for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not quite sure where did they get that "strategy" from. Let's say setenv
function explicitly checks that the name
doesn't have an equal sign inside: https://github.com/lattera/glibc/blob/a2f34833b1042d5d8eeb263b4cf4caaea138c4ad/stdlib/setenv.c#L301
Also https://stackoverflow.com/a/2821183/1449426 has nice answers with links and everything on the matter.
While it's definitely possible to support such a behaviour it will make the code to be more complex, and my humble opinion is that one should avoid unnecessary code complication.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nonetheless it appears to be a real issue. On my up-to-date linux machine (kernel 4.11) the output of env =FOO=BAR xxd /proc/self/environ
shows that leading =
characters can appear.
src/pid/environ.rs
Outdated
); | ||
multi(input).map(|r| { | ||
r.into_iter() | ||
.map(|(x, y)| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you move this map call into the pair
function, the number of vectors constructed can be reduced from two to one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mm, I have there a tuple of two values. How can I go through them with a single call of the map
? Although it definitely makes sense to apply the map
right when the values are obtained (it makes the code to be more clear indeed).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The full code context has disappeared (thanks GitHub), but I think the idea was that you could use a map!
call inside the parsing code, instead of the Iterator::map
call here.
Updated the patch again. This time I used the streams ( |
Oh man that old rust compiler makes me sick |
5e4d63d
to
4f09397
Compare
Not sure what you mean by this, I was debating whether to ask you just to copy that code wholesale. The
The new code is allocating more vectors than the previous version, one for each entry. |
Yeah, you're totally right, I've overlooked it somehow :( Okay, seems like I have to put some (probably a lot of) thought into the matter. |
As I mentioned, I thought your previous cut was very close. Here's the change from /// Parses the provided buffer.
fn parse_environ(input: &[u8]) -> IResult<&[u8], Vec<(OsString, OsString)>> {
// Parse 'key=value" pair.
named!(pair<&[u8], (OsString, OsString)>,
map!(tuple!(take_until_and_consume!("="), take_until!("\0")),
|(x, y): (&[u8], &[u8])| {
(OsString::from(OsStr::from_bytes(x)), OsString::from(OsStr::from_bytes(y)))
}));
terminated!(input,
separated_list!(tag!("\0"), pair),
tag!("\0")
)
} That brings it down from two to one vector allocated. Unfortunately this still isn't covering the leading |
src/pid/environ.rs
Outdated
match pair(&self.data) { | ||
IResult::Done(leftover, parsed) => Some(Ok((leftover.to_vec(), parsed))), | ||
/// An iterator over environment variables. | ||
pub struct EnvironIter<'a> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm in doubt about naming and whether the structure should be exported to a top-level (pid::EnvironIter
) or not (I think not).
src/pid/environ.rs
Outdated
// Calculate position of the equal sign. | ||
let pos = match src.iter().skip(1).position(|c| c == &b'=') { | ||
Some(p) => p, | ||
None => return IResult::Error(error_position!(nom::ErrorKind::Custom(0), src)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a fishy part, not sure which error to return.
src/pid/environ.rs
Outdated
assert!(next.is_some()); | ||
let next = next.unwrap(); | ||
assert!(next.is_ok()); | ||
let next = next.unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At first I was doing this unwraps w/o is_*
checks and it ended up in a mess when an actual error has happened.
Pls have a look at yet another version. Sorry for keeping you busy with this. |
Hi Dan, Also I'd like to make things clear: do you want me to go with allocating a vector that contains all the pairs in one go, or do you like the latest streaming approach? I would rather go with iterators since it's the most idiomatic way as for me. I think I fixed all the extra allocations and manipulations so it should be fine given that I didn't miss anything this time. |
Since so much time has passed you've probably forgotten how did it start so I squashed everything together. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left comments assuming an Iterator based API. I still think it may be better just to use a vector API, since that way the parsing can be done up front, and a single error returned, instead of potentially returning an error for every pair.
src/pid/environ.rs
Outdated
} | ||
|
||
/// An iterator over environment variables. | ||
pub struct EnvironIter<'a> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can Environ
and EnvironIter
be combined into a single type? I think it should be possible just using a Vec<u8>
data field and a usize
offset.
src/pid/environ.rs
Outdated
type Item = Result<(OsString, OsString)>; | ||
|
||
fn next(&mut self) -> Option<Self::Item> { | ||
if self.data.len() == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prefer is_empty
.
src/pid/environ.rs
Outdated
|
||
|
||
#[test] | ||
fn test_iter() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrap all of the tests in a #[cfg(test)] mod tests {
like other files, and put them all at the end of the file.
src/pid/environ.rs
Outdated
/// Executes `take_until_and_consume` and maps its result to OsString. | ||
named_args!( | ||
until<'a>(sep: &'a str)<OsString>, | ||
map!(take_until_and_consume!(sep), from_bytes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the body here is simple enough just to inline into pair
.
Sorry, I totally forgot about this pr. I think I will reiterate it once again in a couple of days considering all the review comments. |
I've kept I've also put some examples in the test module that show how to easily collect all the name-value pairs into different collections (without having to handle errors for every given pair). |
Hi @danburkert ! What do you think about my edits? Should I make some further amendments/improvements? |
Hi!
First of all, thanks for a great and simple library!
I though it'd be nice to contribute to the project so I found out that there is no parser for a
/proc/[pid]/environ
file and implemented it :)Please consider to review & merge if it's okay.