WebSocket masking could be a lot faster #1801

The current masking code is fairly naive and masks one byte at a time, but pretty much every target has 64-bit operations (and some even have 128-bit SIMD operations).
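To make the idea concrete, here is a rough sketch of byte-at-a-time versus 64-bit masking. The function names `mask_bytewise` and `mask_wordwise` are hypothetical and are not taken from the project's code; the masking rule itself (XOR payload byte i with key byte i mod 4) is the one from RFC 6455. It assumes the mask key is available as 4 raw bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Baseline (hypothetical name): one XOR per payload byte. */
static void mask_bytewise(uint8_t *data, size_t len, const uint8_t key[4]) {
    for (size_t i = 0; i < len; ++i)
        data[i] ^= key[i % 4];
}

/* Sketch (hypothetical name): XOR 8 payload bytes per iteration. */
static void mask_wordwise(uint8_t *data, size_t len, const uint8_t key[4]) {
    /* Repeat the 4-byte key twice to form an 8-byte pattern. Copying the
       bytes in memory order keeps this independent of endianness. */
    uint8_t key8[8];
    memcpy(key8, key, 4);
    memcpy(key8 + 4, key, 4);
    uint64_t k;
    memcpy(&k, key8, sizeof k);

    size_t i = 0;
    for (; i + 8 <= len; i += 8) {
        uint64_t w;
        memcpy(&w, data + i, sizeof w);   /* load that is safe at any alignment */
        w ^= k;
        memcpy(data + i, &w, sizeof w);   /* store back */
    }
    /* Tail: i is a multiple of 8 here, so key[i % 4] stays in phase. */
    for (; i < len; ++i)
        data[i] ^= key[i % 4];
}
```

The fixed-size `memcpy` calls are there to avoid undefined behavior from casting a byte pointer to `uint64_t *`; mainstream compilers typically turn them into single load and store instructions, but that is worth confirming on the targets that matter.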

It's possible that optimizers are already good enough to mask this efficiently, but from what I observe on godbolt, neither clang nor gcc optimizes it well even under -O3, at least for x86.

The interesting concern here will be dealing with misaligned data. That is not an issue on x86, and it is usually not an issue on modern ARM (aarch64) either.
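For targets where misaligned 64-bit access is slow or unsupported, one possible approach is to mask a few leading bytes until the pointer reaches an 8-byte boundary, then rotate the key so it stays in phase with the payload. This is only a sketch under that assumption; `mask_aligned` is a hypothetical name, not something from the existing code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sketch (hypothetical name): mask so that the 64-bit loop only touches
   8-byte-aligned addresses. */
static void mask_aligned(uint8_t *data, size_t len, const uint8_t key[4]) {
    size_t i = 0;

    /* Head: byte-at-a-time until data + i sits on an 8-byte boundary. */
    while (i < len && ((uintptr_t)(data + i) % 8) != 0) {
        data[i] ^= key[i % 4];
        ++i;
    }

    /* Rotate the key: byte j of the pattern must match payload index i + j.
       Since the loop below advances i by 8, the rotation stays valid. */
    uint8_t key8[8];
    for (size_t j = 0; j < 8; ++j)
        key8[j] = key[(i + j) % 4];
    uint64_t k;
    memcpy(&k, key8, sizeof k);

    /* Body: aligned 8-byte chunks. */
    for (; i + 8 <= len; i += 8) {
        uint64_t w;
        memcpy(&w, data + i, sizeof w);
        w ^= k;
        memcpy(data + i, &w, sizeof w);
    }

    /* Tail: remaining 0..7 bytes. */
    for (; i < len; ++i)
        data[i] ^= key[i % 4];
}
```

Whether the extra head loop pays off compared with plain unaligned loads would need measuring; on x86 and aarch64 the simpler unaligned version may well be just as fast.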

Fixing this would be a substantial improvement for high-bandwidth WebSocket messages.
