Reduce monomorphization output by using more specialized code paths for deserializers #360

Open
wants to merge 5 commits into
base: master

Alexis211

This PR tries to reduce the size of the deserializers generated by rmp-serde. Instead of matching against markers that are irrelevant to the type being deserialized, each code path now uses a specialized match expression, generated by a macro, that handles only the relevant markers.
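To illustrate the idea for readers who have not opened the diff, here is a minimal sketch of a macro that expands to a match listing only the markers a given code path cares about, plus a single catch-all error arm. Everything in it is hypothetical: the `Marker`, `Value` and `Error` types are simplified stand-ins, and `match_markers!`, `decode_bool` and `decode_uint` are names invented for this example, not the ones used in this PR or in rmp-serde.

```rust
// Hypothetical, simplified stand-in for a MessagePack marker type
// (not the real rmp marker definition).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Marker {
    Nil,
    True,
    False,
    U8,
    U16,
    Str8,
    Array16,
    Map16,
}

// Hypothetical decoded value, reduced to the two cases used below.
#[derive(Debug, PartialEq)]
enum Value {
    Bool(bool),
    UInt(u64),
}

// Hypothetical error type: the marker did not fit the requested type.
#[derive(Debug, PartialEq)]
enum Error {
    TypeMismatch(Marker),
}

// Hypothetical macro: expands to a match that lists only the requested
// markers, followed by one catch-all arm that reports a type mismatch.
// Each arm must end with a trailing comma.
macro_rules! match_markers {
    ($marker:expr, { $($pat:pat => $body:expr,)+ }) => {
        match $marker {
            $($pat => Ok($body),)+
            other => Err(Error::TypeMismatch(other)),
        }
    };
}

// Specialized path: only the two boolean markers are matched.
fn decode_bool(marker: Marker) -> Result<Value, Error> {
    match_markers!(marker, {
        Marker::True => Value::Bool(true),
        Marker::False => Value::Bool(false),
    })
}

// Specialized path: only the unsigned-integer markers are matched.
// `payload` stands in for the bytes that would follow the marker.
fn decode_uint(marker: Marker, payload: u64) -> Result<Value, Error> {
    match_markers!(marker, {
        Marker::U8 => Value::UInt(payload),
        Marker::U16 => Value::UInt(payload),
    })
}
```

With this sketch, `decode_bool(Marker::True)` yields `Ok(Value::Bool(true))`, while `decode_bool(Marker::Nil)` falls through to the catch-all arm and yields `Err(Error::TypeMismatch(Marker::Nil))`. The point is that when such a function is generic and instantiated many times, every copy carries only the few arms it needs rather than a match over every marker.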

Filtered output of cargo llvm-lines on one of the biggest crates of Garage, before this change:

     3456 (0.3%, 45.3%)     72 (0.2%, 26.4%)  tokio::runtime::task::core::Core<T,S>::set_stage
     3461 (0.3%, 45.0%)     63 (0.2%, 26.1%)  <rmp_serde::decode::MapAccess<R,C> as serde::de::MapAccess>::next_key_seed
     3816 (0.3%, 44.7%)     72 (0.2%, 25.9%)  tokio::runtime::task::core::Core<T,S>::poll::{{closure}}
--
     4509 (0.4%, 40.2%)      9 (0.0%, 22.1%)  garage_net::endpoint::Endpoint<M,H>::call_streaming::{{closure}}
     4639 (0.4%, 39.8%)     63 (0.2%, 22.1%)  <rmp_serde::decode::SeqAccess<R,C> as serde::de::SeqAccess>::next_element_seed
     4644 (0.4%, 39.4%)     36 (0.1%, 21.9%)  tokio::runtime::scheduler::current_thread::Handle::spawn
--
     5826 (0.5%, 35.0%)    255 (0.8%, 17.1%)  core::result::Result<T,E>::map
     5916 (0.5%, 34.5%)    102 (0.3%, 16.3%)  <&mut rmp_serde::encode::Serializer<W,C> as serde::ser::Serializer>::serialize_newtype_variant
     6379 (0.5%, 34.0%)     94 (0.3%, 15.9%)  core::iter::traits::iterator::Iterator::try_fold
--
    14904 (1.3%, 24.3%)    432 (1.4%,  3.3%)  std::panic::catch_unwind
    30468 (2.6%, 23.1%)    196 (0.6%,  1.9%)  rmp_serde::decode::read_str_data
   103382 (8.8%, 20.5%)    197 (0.6%,  1.3%)  rmp_serde::decode::any_num
   138318 (11.7%, 11.7%)   196 (0.6%,  0.6%)  rmp_serde::decode::Deserializer<R,C>::any_inner
  1180351                31016                (TOTAL)
  -----                 ------               -------------
  Lines                 Copies               Function name

and after this change:

    3212 (0.4%, 33.3%)      1 (0.0%, 22.8%)  garage_api_admin::router_v2::<impl garage_api_admin::api::AdminApiRequest>::from_request::{{closure}}
    3216 (0.4%, 33.0%)     16 (0.1%, 22.8%)  <&mut rmp_serde::encode::Serializer<W,C> as serde::ser::Serializer>::collect_seq
    3240 (0.4%, 32.6%)     72 (0.3%, 22.8%)  tokio::runtime::task::harness::Harness<T,S>::release
--
    3456 (0.4%, 31.2%)     72 (0.3%, 22.1%)  tokio::runtime::task::core::Core<T,S>::set_stage
    3461 (0.4%, 30.8%)     63 (0.2%, 21.9%)  <rmp_serde::decode::MapAccess<R,C> as serde::de::MapAccess>::next_key_seed
    3522 (0.4%, 30.4%)     15 (0.1%, 21.6%)  <&mut rmp_serde::decode::Deserializer<R,C> as serde::de::Deserializer>::deserialize_map
    3816 (0.4%, 30.0%)     72 (0.3%, 21.6%)  tokio::runtime::task::core::Core<T,S>::poll::{{closure}}
--
    3908 (0.4%, 29.2%)     39 (0.1%, 20.7%)  alloc::vec::Vec<T,A>::extend_desugared
    3972 (0.4%, 28.8%)     12 (0.0%, 20.6%)  <&mut rmp_serde::decode::Deserializer<R,C> as serde::de::Deserializer>::deserialize_seq
    3999 (0.4%, 28.3%)     30 (0.1%, 20.5%)  <core::slice::iter::Iter<T> as core::iter::traits::iterator::Iterator>::fold
--
    4509 (0.5%, 25.6%)      9 (0.0%, 19.3%)  garage_net::endpoint::Endpoint<M,H>::call_streaming::{{closure}}
    4639 (0.5%, 25.1%)     63 (0.2%, 19.3%)  <rmp_serde::decode::SeqAccess<R,C> as serde::de::SeqAccess>::next_element_seed
    4644 (0.5%, 24.6%)     36 (0.1%, 19.1%)  tokio::runtime::scheduler::current_thread::Handle::spawn
--
    5402 (0.6%, 21.9%)    235 (0.9%, 17.5%)  alloc::boxed::Box<T>::new
    5916 (0.6%, 21.3%)    102 (0.4%, 16.6%)  <&mut rmp_serde::encode::Serializer<W,C> as serde::ser::Serializer>::serialize_newtype_variant
    6212 (0.7%, 20.7%)    767 (2.9%, 16.2%)  <core::result::Result<T,F> as core::ops::try_trait::FromResidual<core::result::Result<core::convert::Infallible,E>>>::from_residual
--
   11149 (1.2%, 11.6%)    362 (1.4%,  6.0%)  tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
   11539 (1.3%, 10.4%)     73 (0.3%,  4.7%)  rmp_serde::decode::read_str_data
   12427 (1.4%,  9.1%)    571 (2.2%,  4.4%)  <core::result::Result<T,E> as core::ops::try_trait::Try>::branch
   14904 (1.6%,  7.7%)    432 (1.6%,  2.2%)  std::panic::catch_unwind
   25430 (2.8%,  6.1%)     72 (0.3%,  0.6%)  <&mut rmp_serde::decode::Deserializer<R,C> as serde::de::Deserializer>::deserialize_identifier
   30260 (3.3%,  3.3%)     75 (0.3%,  0.3%)  <&mut rmp_serde::decode::Deserializer<R,C> as serde::de::Deserializer>::deserialize_struct
  911552                26229                (TOTAL)
  -----                 ------               -------------
  Lines                 Copies               Function name

As you can see, this reduces the size of the LLVM IR from 1180351 to 911552 lines. That is 268799 fewer lines, or about 22.8% of all code generated by this crate.
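The reduction can be re-derived from the two `(TOTAL)` rows above. The helper below is only an illustrative sketch; the function name `reduction` is made up for this example.

```rust
/// Returns (lines removed, share of the original total in percent).
/// Assumes `after <= before`; otherwise the subtraction would underflow.
fn reduction(before: u64, after: u64) -> (u64, f64) {
    let removed = before - after;
    let percent = removed as f64 * 100.0 / before as f64;
    (removed, percent)
}
```

Calling `reduction(1_180_351, 911_552)` gives 268799 removed lines, which is 22.77% of the original total when formatted to two decimals.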

In this example, most of the serialization and deserialization routines correspond to (de)serializing structs defined in this file.

s-nie commented Feb 7, 2025

Neat! In my use case this does better than #350 with 3436229 (vs. 3632031) LLVM lines and 59s (vs. 65s) compile time.
