Description
While trying to improve the performance of filepath functions that use ShortByteString, I found that unpack
slowed down a couple of them. Switching to several calls of uncons seemed to improve performance. In particular:
readDriveUNC :: FILEPATH -> Maybe (FILEPATH, FILEPATH)
-readDriveUNC bs = case unpack bs of
- (s1:s2:q:s3:xs)
- | q == _question && L.all isPathSeparator [s1,s2,s3] ->
- case L.map toUpper xs of
- (u:n:c:s4:_)
- | u == _U && n == _N && c == _C && isPathSeparator s4 ->
- let (a,b) = readDriveShareName (pack (L.drop 4 xs))
- in Just (pack (s1:s2:_question:s3:L.take 4 xs) <> a, b)
- _ -> case readDriveLetter (pack xs) of
- -- Extended-length path.
- Just (a,b) -> Just (pack [s1,s2,_question,s3] <> a, b)
- Nothing -> Nothing
- _ -> Nothing
+readDriveUNC bs
+ | Just (s1, r1) <- uncons bs
+ , Just (s2, r2) <- uncons r1
+ , Just (q, r3) <- uncons r2
+ , Just (s3, xs) <- uncons r3
+ , q == _question
+ , L.all isPathSeparator [s1,s2,s3] =
+ if | Just (toUpper -> u, k1) <- uncons xs
+ , Just (toUpper -> n, k2) <- uncons k1
+ , Just (toUpper -> c, k3) <- uncons k2
+ , Just (s4, rr) <- uncons k3
+ , u == _U
+ , n == _N
+ , c == _C
+ , isPathSeparator s4 ->
+ let (a,b) = readDriveShareName rr
+ in Just (pack [s1,s2,_question,s3,u,n,c,s4] <> a, b)
+ | otherwise -> case readDriveLetter xs of
+ -- Extended-length path.
+ Just (a,b) -> Just (pack [s1,s2,_question,s3] <> a, b)
+ Nothing -> Nothing
+ | otherwise = Nothing
https://gitlab.haskell.org/haskell/filepath/-/merge_requests/116/diffs
The consecutive calls to uncons are not only awkward, each one also allocates and copies a new tail, although only the last tail is needed.
So I'm wondering if a function like this might be useful (at least for ShortByteString):
unconsN :: Int -> ShortByteString -> Maybe ([Word8], ShortByteString)
The obvious disadvantage here is that you end up with a partial pattern match on the [Word8] list, because without dependent types the length of the list is not tracked in its type.
Providing uncons2, uncons3 and using a tuple instead might be an alternative, but less general.
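To make the two options concrete, here is a minimal sketch of what they could look like. This is not an existing bytestring API: unconsN and uncons2 are the hypothetical names from this proposal, and the sketch assumes a bytestring version whose Data.ByteString.Short exports drop (0.11.3 or later, as far as I know). Both variants read the head bytes with index and copy the tail only once.

```haskell
import qualified Data.ByteString.Short as SBS
import Data.ByteString.Short (ShortByteString)
import Data.Word (Word8)

-- Hypothetical: split off the first n bytes as a list.
-- Returns Nothing if n is negative or fewer than n bytes are available.
-- The tail is copied once, by SBS.drop.
unconsN :: Int -> ShortByteString -> Maybe ([Word8], ShortByteString)
unconsN n sbs
  | n < 0 || SBS.length sbs < n = Nothing
  | otherwise = Just ([SBS.index sbs i | i <- [0 .. n - 1]], SBS.drop n sbs)

-- Hypothetical: tuple-based variant for exactly two bytes.
-- The result type fixes the number of heads, so callers can match totally.
uncons2 :: ShortByteString -> Maybe (Word8, Word8, ShortByteString)
uncons2 sbs
  | SBS.length sbs < 2 = Nothing
  | otherwise = Just (SBS.index sbs 0, SBS.index sbs 1, SBS.drop 2 sbs)
```

With unconsN the caller still has to write something like `Just ([a, b, c], rest) <- unconsN 3 bs`, which the compiler cannot check for totality. The tuple variants avoid that, at the cost of one function per arity.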
The other option would be to figure out why unpack is so slow. As far as I understand, it is only semi-lazy, e.g. it unpacks the first 100 bytes strictly.