Provide unconsN/unsnocN? #524

Open
@hasufell

Description

While trying to improve the performance of filepath functions that use ShortByteString, I found that unpack slowed down a couple of functions. Replacing it with several calls to uncons seemed to improve performance. In particular:

 readDriveUNC :: FILEPATH -> Maybe (FILEPATH, FILEPATH)
-readDriveUNC bs = case unpack bs of
-  (s1:s2:q:s3:xs)
-    | q == _question && L.all isPathSeparator [s1,s2,s3] ->
-      case L.map toUpper xs of
-          (u:n:c:s4:_)
-            | u == _U && n == _N && c == _C && isPathSeparator s4 ->
-              let (a,b) = readDriveShareName (pack (L.drop 4 xs))
-              in Just (pack (s1:s2:_question:s3:L.take 4 xs) <> a, b)
-          _ -> case readDriveLetter (pack xs) of
-                   -- Extended-length path.
-                   Just (a,b) -> Just (pack [s1,s2,_question,s3] <> a, b)
-                   Nothing -> Nothing
-  _ -> Nothing
+readDriveUNC bs
+  | Just (s1, r1) <- uncons bs
+  , Just (s2, r2) <- uncons r1
+  , Just (q,  r3) <- uncons r2
+  , Just (s3, xs) <- uncons r3
+  , q == _question
+  , L.all isPathSeparator [s1,s2,s3] =
+      if | Just (toUpper -> u, k1) <- uncons xs
+         , Just (toUpper -> n, k2) <- uncons k1
+         , Just (toUpper -> c, k3) <- uncons k2
+         , Just (s4,           rr) <- uncons k3
+         , u == _U
+         , n == _N
+         , c == _C
+         , isPathSeparator s4 ->
+              let (a,b) = readDriveShareName rr
+              in Just (pack [s1,s2,_question,s3,u,n,c,s4] <> a, b)
+         | otherwise -> case readDriveLetter xs of
+                          -- Extended-length path.
+                          Just (a,b) -> Just (pack [s1,s2,_question,s3] <> a, b)
+                          Nothing -> Nothing
+  | otherwise = Nothing
 

https://gitlab.haskell.org/haskell/filepath/-/merge_requests/116/diffs

The consecutive calls to uncons are not only awkward, but each one also incurs a copy of the tail.

So I'm wondering if a function like this might be useful (at least for ShortByteString):

unconsN :: Int -> ShortByteString -> Maybe ([Word8], ShortByteString)

The obvious disadvantage here is that pattern matching on the returned [Word8] list is partial: without dependent types, the type cannot guarantee that the list has exactly n elements.
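To make the proposal concrete, here is a minimal sketch of how unconsN could be written on top of the public Data.ByteString.Short API. It assumes bytestring >= 0.11.3.0 (for `drop` on ShortByteString); `unconsN` and `splitPrefix4` are hypothetical names, not existing library functions. The tail is copied once, and the second function shows the catch-all branch that the list-based result forces on callers.

```haskell
import           Data.ByteString.Short (ShortByteString)
import qualified Data.ByteString.Short as SBS
import           Data.Word (Word8)

-- Hypothetical helper (not part of bytestring): take the first n bytes
-- as a list and return the remainder, copying the tail only once.
unconsN :: Int -> ShortByteString -> Maybe ([Word8], ShortByteString)
unconsN n bs
  | n < 0 || n > SBS.length bs = Nothing
  | otherwise =
      -- Read the head bytes by index, then drop them in a single copy.
      Just ([SBS.index bs i | i <- [0 .. n - 1]], SBS.drop n bs)

-- Hypothetical caller: the list pattern is partial as far as the type
-- checker knows, so a catch-all branch is needed even though unconsN 4
-- only ever returns a four-element list.
splitPrefix4 :: ShortByteString
             -> Maybe ((Word8, Word8, Word8, Word8), ShortByteString)
splitPrefix4 bs = case unconsN 4 bs of
  Just ([a, b, c, d], rest) -> Just ((a, b, c, d), rest)
  _                         -> Nothing
```

An implementation inside bytestring itself could presumably avoid the intermediate list by working on the underlying byte array directly; this sketch only illustrates the shape of the API.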

Providing uncons2 and uncons3 that return tuples instead might be an alternative, but it is less general.
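For comparison, the tuple-based variants might look like the following sketch. Again, `uncons2` and `uncons3` are hypothetical names, and bytestring >= 0.11.3.0 is assumed for `drop`. The pattern match on the result is total here, at the cost of needing one function per arity.

```haskell
import           Data.ByteString.Short (ShortByteString)
import qualified Data.ByteString.Short as SBS
import           Data.Word (Word8)

-- Hypothetical: split off exactly two leading bytes, copying the tail once.
uncons2 :: ShortByteString -> Maybe (Word8, Word8, ShortByteString)
uncons2 bs
  | SBS.length bs < 2 = Nothing
  | otherwise = Just (SBS.index bs 0, SBS.index bs 1, SBS.drop 2 bs)

-- Hypothetical: split off exactly three leading bytes, copying the tail once.
uncons3 :: ShortByteString -> Maybe (Word8, Word8, Word8, ShortByteString)
uncons3 bs
  | SBS.length bs < 3 = Nothing
  | otherwise =
      Just (SBS.index bs 0, SBS.index bs 1, SBS.index bs 2, SBS.drop 3 bs)
```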

The other way would be to figure out why unpack is so slow. Afaiu it's only semi lazy, e.g. unpacks the first 100 bytes strictly.
