zip-2.0.1/0000755000000000000000000000000007346545000010505 5ustar0000000000000000zip-2.0.1/CHANGELOG.md0000644000000000000000000001135307346545000012321 0ustar0000000000000000## Zip 2.0.1 * Fixed corruption of large entries when zip64 is used. [Issue 111](https://github.com/mrkkrp/zip/issues/111). ## Zip 2.0.0 * Unified `BZip2Unsupported` and `ZstdUnsupported` into a single data constructor `UnsupportedCompressionMethod` with a `CompressionMethod` field. ## Zip 1.7.2 * Now the ZIP64 extra field is only written when it is necessary. Previously it was written unconditionally and it confused some tools. ## Zip 1.7.1 * Fixed compilation with zstd and/or bzip2 disabled. ## Zip 1.7.0 * Set user permissions on linux platform as follows: if an existing file is added, use its permissions; if an entry is generated from a bytestring or a stream, use 0600. This behavior mimics the zip utility. ## Zip 1.6.0 * Added support for Zstandard (zstd) compression * Added a Cabal flag `-fdisable-zstd` to remove the zstd C library dependency and hence support for Zstd entries. ## Zip 1.5.0 * Added the `packDirRecur'` function. * Dropped support for GHC 8.4. ## Zip 1.4.1 * Fixed the build on Mac. ## Zip 1.4.0 * The “version made by” info inside archive now correctly sets Unix as the OS that produced the archive when the library is compiled on Unix. This allows other utilities such as `unzip` to read and correctly restore file permissions. [Issue 62](https://github.com/mrkkrp/zip/issues/62). * Added the `Codec.Archive.Zip.Unix` module. ## Zip 1.3.2 * Fix a bug where removing a temporary file failed in the prescence of async exceptions. ## Zip 1.3.1 * The test suite is now faster. ## Zip 1.3.0 * Dropped support for GHC 8.2 and older. * Added a Cabal flag `-fdisable-bzip2` to remove the bzip2 C library dependency and hence support for BZip2 entries. ## Zip 1.2.0 * Added the `setExternalFileAttrs` function and the `edExternalFileAttrs` field in the `EntryDescription` record. ## Zip 1.1.0 * Made `saveEntry` and `unpackInto` restore modification time of files. ## Zip 1.0.0 * Works with `conduit-1.3.0`, `conduit-extra-1.3.0`, `resourcet-1.2.0` and `bzlib-conduit-0.3.0`. * Stop depending on `path`, `path-io`, and `plub-b`. * Made the module `Codec.Archive.Zip.Type` non-public. * Remove derived instances of `Data` and `Generic` for `EntrySelector` not to expose its inner structure. * Change signature of the `loadEntry` function, its second argument is now just `EntrySelector` of the entry to add. * The second argument of `packDirRecur` now receives paths that are relative to the root of the directory we pack. ## Zip 0.2.0 * Added `MonadBase` and `MonadBaseControl` instances for the `ZipArchive` monad. Also exported the `ZipState` type without revealing its data constructor and records. * Dropped `MonadThrow` and `MonadCatch` constraints for `createArchive` and `withArchive`. ## Zip 0.1.11 * Minor refactoring. * Improved documentation and metadata. ## Zip 0.1.10 * Made `getEntrySource` polymorphic in terms of the `Source` it returns. * Numerous cosmetic corrections to the docs. * Derived `Eq` and `Ord` for `EntrySelectorException` and `ZipException`. ## Zip 0.1.9 * Fixed a bug with modification time serialization on 32 bit systems. ## Zip 0.1.8 * Fixed a bug that caused `zip` to write incorrect number of entries (instead of `0xffff`) in central directory when Zip64 feature is enabled. ## Zip 0.1.7 * Fix literal overflows on 32 bit systems. ## Zip 0.1.6 * Allowed `time-1.7`. * Fixed an issue when empty archives with Zip 64 feature enabled could not be read (the “Parsing of archive structure failed: Cannot locate end of central directory”). ## Zip 0.1.5 * Switched to using `withBinaryFile` instead of `withFile`, because the latter does nasty conversions on Windows, see docs for `openBinaryFile`. ## Zip 0.1.4 * Added several simple code examples in `Codec.Archive.Zip`. * Derived `Typeable`, `Data`, `Generic` for `EntrySelector`. * Derived `Typeable` for `EntryDescription`. * Derived `Show`, `Ord`, `Bounded`, `Data`, and `Typeable` for `CompressionMethod`. * Derived `Read`, `Ord`, `Typeable`, and `Data` for `ArchiveDescription`. ## Zip 0.1.3 * Improved speed of detection of invalid archives. * Introduced `getEntrySource` function. ## Zip 0.1.2 * Relaxed dependency on `semigroups`. * Added explicit check of “version needed to extract”, so if archive uses some advanced features that we do not support yet, parsing fails. * Value of “version needed to extract” field is now calculated dynamically with respect to actually used features, e.g. if you just store or deflate a not very big file, `2.0` version will be written (previously we wrote `4.6` unconditionally). This is needed to avoid scaring tools that can only handle basic Zip archives. ## Zip 0.1.1 * Make decoding of CP437 faster. ## Zip 0.1.0 * Initial release. zip-2.0.1/Codec/Archive/0000755000000000000000000000000007346545000013103 5ustar0000000000000000zip-2.0.1/Codec/Archive/Zip.hs0000644000000000000000000005430407346545000014207 0ustar0000000000000000{-# LANGUAGE CPP #-} {-# LANGUAGE GeneralizedNewtypeDeriving #-} {-# LANGUAGE LambdaCase #-} {-# LANGUAGE MultiParamTypeClasses #-} {-# LANGUAGE TupleSections #-} {-# LANGUAGE TypeFamilies #-} -- | -- Module : Codec.Archive.Zip -- Copyright : © 2016–present Mark Karpov -- License : BSD 3 clause -- -- Maintainer : Mark Karpov -- Stability : experimental -- Portability : portable -- -- The module provides everything you may need to manipulate Zip archives. -- There are three things that should be clarified right away, to avoid -- confusion. -- -- First, we use the 'EntrySelector' type that can be obtained from relative -- 'FilePath's (paths to directories are not allowed). This method may seem -- awkward at first, but it will protect you from the problems with -- portability when your archive is unpacked on a different platform. -- -- Second, there is no way to add directories, or to be precise, /empty -- directories/ to your archive. This approach is used in Git, and I find it -- sane. -- -- Finally, the third feature of the library is that it does not modify -- archive instantly, because doing so on every manipulation would often be -- inefficient. Instead, we maintain a collection of pending actions that -- can be turned into an optimized procedure that efficiently modifies the -- archive in one pass. Normally, this should be of no concern to you, -- because all actions are performed automatically when you leave the -- 'ZipArchive' monad. If, however, you ever need to force an update, the -- 'commit' function is your friend. -- -- === Examples -- -- An example of a program that prints a list of archive entries: -- -- > import Codec.Archive.Zip -- > import System.Environment (getArgs) -- > import qualified Data.Map as M -- > -- > main :: IO () -- > main = do -- > [path] <- getArgs -- > entries <- withArchive path (M.keys <$> getEntries) -- > mapM_ print entries -- -- Create a Zip archive with a “Hello World” file: -- -- > import Codec.Archive.Zip -- > import System.Environment (getArgs) -- > -- > main :: IO () -- > main = do -- > [path] <- getArgs -- > s <- mkEntrySelector "hello-world.txt" -- > createArchive path (addEntry Store "Hello, World!" s) -- -- Extract contents of a file and print them: -- -- > import Codec.Archive.Zip -- > import System.Environment (getArgs) -- > import qualified Data.ByteString.Char8 as B -- > -- > main :: IO () -- > main = do -- > [path,f] <- getArgs -- > s <- mkEntrySelector f -- > bs <- withArchive path (getEntry s) -- > B.putStrLn bs module Codec.Archive.Zip ( -- * Types -- ** Entry selector EntrySelector, mkEntrySelector, unEntrySelector, getEntryName, EntrySelectorException (..), -- ** Entry description EntryDescription (..), CompressionMethod (..), -- ** Archive description ArchiveDescription (..), -- ** Exceptions ZipException (..), -- * Archive monad ZipArchive, ZipState, createArchive, withArchive, -- * Retrieving information getEntries, doesEntryExist, getEntryDesc, getEntry, getEntrySource, sourceEntry, saveEntry, checkEntry, unpackInto, getArchiveComment, getArchiveDescription, -- * Modifying archive -- ** Adding entries addEntry, sinkEntry, loadEntry, copyEntry, packDirRecur, packDirRecur', -- ** Modifying entries renameEntry, deleteEntry, recompress, setEntryComment, deleteEntryComment, setModTime, addExtraField, deleteExtraField, setExternalFileAttrs, forEntries, -- ** Operations on archive as a whole setArchiveComment, deleteArchiveComment, -- ** Control over editing undoEntryChanges, undoArchiveChanges, undoAll, commit, ) where import Codec.Archive.Zip.Internal qualified as I import Codec.Archive.Zip.Type import Conduit (PrimMonad) import Control.Monad import Control.Monad.Base (MonadBase (..)) import Control.Monad.Catch import Control.Monad.State.Strict import Control.Monad.Trans.Control (MonadBaseControl (..)) import Control.Monad.Trans.Resource (MonadResource, ResourceT) import Data.ByteString (ByteString) import Data.Conduit (ConduitT, (.|)) import Data.Conduit qualified as C import Data.Conduit.Binary qualified as CB import Data.Conduit.List qualified as CL import Data.DList qualified as DList import Data.Map.Strict (Map, (!)) import Data.Map.Strict qualified as M import Data.Sequence (Seq, (|>)) import Data.Sequence qualified as S import Data.Set qualified as E import Data.Text (Text) import Data.Time.Clock (UTCTime) import Data.Void import Data.Word (Word16, Word32) import System.Directory import System.FilePath (()) import System.FilePath qualified as FP import System.IO.Error (isDoesNotExistError) #ifndef mingw32_HOST_OS import qualified Codec.Archive.Zip.Unix as Unix import qualified System.Posix as Unix #endif ---------------------------------------------------------------------------- -- Archive monad -- | Monad that provides context necessary for performing operations on zip -- archives. It's intentionally opaque and not a monad transformer to limit -- the actions that can be performed in it to those provided by this module -- and their combinations. newtype ZipArchive a = ZipArchive { unZipArchive :: StateT ZipState IO a } deriving ( Functor, Applicative, Monad, MonadIO, MonadThrow, MonadCatch, MonadMask ) -- | @since 0.2.0 instance MonadBase IO ZipArchive where liftBase = liftIO -- | @since 0.2.0 instance MonadBaseControl IO ZipArchive where type StM ZipArchive a = (a, ZipState) liftBaseWith f = ZipArchive . StateT $ \s -> (,s) <$> f (flip runStateT s . unZipArchive) {-# INLINEABLE liftBaseWith #-} restoreM = ZipArchive . StateT . const . return {-# INLINEABLE restoreM #-} -- | The internal state record used by the 'ZipArchive' monad. This is only -- exported for use with 'MonadBaseControl' methods, you can't look inside. -- -- @since 0.2.0 data ZipState = ZipState { -- | Path to zip archive zsFilePath :: FilePath, -- | Actual collection of entries zsEntries :: Map EntrySelector EntryDescription, -- | Info about the whole archive zsArchive :: ArchiveDescription, -- | Pending actions zsActions :: Seq I.PendingAction } -- | Create a new archive given its location and an action that describes -- how to create contents of the archive. This will silently overwrite the -- specified file if it already exists. See 'withArchive' if you want to -- work with an existing archive. createArchive :: (MonadIO m) => -- | Location of the archive file to create FilePath -> -- | Actions that create the archive's content ZipArchive a -> m a createArchive path m = liftIO $ do apath <- makeAbsolute path ignoringAbsence (removeFile apath) let st = ZipState { zsFilePath = apath, zsEntries = M.empty, zsArchive = ArchiveDescription Nothing 0 0, zsActions = S.empty } action = unZipArchive (m <* commit) evalStateT action st -- | Work with an existing archive. See 'createArchive' if you want to -- create a new archive instead. -- -- This operation may fail with: -- -- * @isAlreadyInUseError@ if the file is already open and cannot be -- reopened; -- -- * @isDoesNotExistError@ if the file does not exist; -- -- * @isPermissionError@ if the user does not have permission to open -- the file; -- -- * 'ParsingFailed' when specified archive is something this library -- cannot parse (this includes multi-disk archives, for example). -- -- Please note that entries with invalid (non-portable) file names may be -- missing in the list of entries. Files that are compressed with -- unsupported compression methods are skipped as well. Also, if several -- entries would collide on some operating systems (such as Windows, because -- of its case-insensitivity), only one of them will be available, because -- 'EntrySelector' is case-insensitive. These are the consequences of the -- design decision to make it impossible to create non-portable archives -- with this library. withArchive :: (MonadIO m) => -- | Location of the archive to work with FilePath -> -- | Actions on that archive ZipArchive a -> m a withArchive path m = liftIO $ do apath <- canonicalizePath path (desc, entries) <- liftIO (I.scanArchive apath) let st = ZipState { zsFilePath = apath, zsEntries = entries, zsArchive = desc, zsActions = S.empty } action = unZipArchive (m <* commit) liftIO (evalStateT action st) ---------------------------------------------------------------------------- -- Retrieving information -- | Retrieve a description of all archive entries. This is an efficient -- operation that can be used for example to list all entries in the -- archive. Do not hesitate to use the function frequently: scanning of the -- archive happens only once. -- -- Please note that the returned value only reflects the current contents of -- the archive in file system, non-committed actions are not reflected, see -- 'commit' for more information. getEntries :: ZipArchive (Map EntrySelector EntryDescription) getEntries = ZipArchive (gets zsEntries) -- | Check whether the specified entry exists in the archive. This is a -- simple shortcut defined as: -- -- > doesEntryExist s = M.member s <$> getEntries doesEntryExist :: EntrySelector -> ZipArchive Bool doesEntryExist s = M.member s <$> getEntries -- | Get 'EntryDescription' for a specified entry. This is a simple shortcut -- defined as: -- -- > getEntryDesc s = M.lookup s <$> getEntries getEntryDesc :: EntrySelector -> ZipArchive (Maybe EntryDescription) getEntryDesc s = M.lookup s <$> getEntries -- | Get contents of a specific archive entry as a strict 'ByteString'. It's -- not recommended to use this on big entries, because it will suck out a -- lot of memory. For big entries, use conduits: 'sourceEntry'. -- -- Throws: 'EntryDoesNotExist'. getEntry :: -- | Selector that identifies archive entry EntrySelector -> -- | Contents of the entry ZipArchive ByteString getEntry s = sourceEntry s (CL.foldMap id) -- | Get an entry source. -- -- Throws: 'EntryDoesNotExist'. -- -- @since 0.1.3 getEntrySource :: (PrimMonad m, MonadThrow m, MonadResource m) => -- | Selector that identifies archive entry EntrySelector -> ZipArchive (ConduitT () ByteString m ()) getEntrySource s = do path <- getFilePath mdesc <- M.lookup s <$> getEntries case mdesc of Nothing -> throwM (EntryDoesNotExist path s) Just desc -> return (I.sourceEntry path desc True) -- | Stream contents of an archive entry to the given 'Sink'. -- -- Throws: 'EntryDoesNotExist'. sourceEntry :: -- | Selector that identifies the archive entry EntrySelector -> -- | Sink where to stream entry contents ConduitT ByteString Void (ResourceT IO) a -> -- | Contents of the entry (if found) ZipArchive a sourceEntry s sink = do src <- getEntrySource s (liftIO . C.runConduitRes) (src .| sink) -- | Save a specific archive entry as a file in the file system. -- -- Throws: 'EntryDoesNotExist'. saveEntry :: -- | Selector that identifies the archive entry EntrySelector -> -- | Where to save the file FilePath -> ZipArchive () saveEntry s path = do sourceEntry s (CB.sinkFile path) med <- getEntryDesc s forM_ med (liftIO . setModificationTime path . edModTime) -- | Calculate CRC32 check sum and compare it with the value read from the -- archive. The function returns 'True' when the check sums are the -- same—that is, the data is not corrupted. -- -- Throws: 'EntryDoesNotExist'. checkEntry :: -- | Selector that identifies the archive entry EntrySelector -> -- | Is the entry intact? ZipArchive Bool checkEntry s = do calculated <- sourceEntry s I.crc32Sink given <- edCRC32 . (! s) <$> getEntries -- NOTE We can assume that entry exists for sure because otherwise -- 'sourceEntry' would have thrown 'EntryDoesNotExist' already. return (calculated == given) -- | Unpack the archive into the specified directory. The directory will be -- created if it does not exist. unpackInto :: FilePath -> ZipArchive () unpackInto dir' = do selectors <- M.keysSet <$> getEntries unless (null selectors) $ do dir <- liftIO (makeAbsolute dir') liftIO (createDirectoryIfMissing True dir) let dirs = E.map (FP.takeDirectory . (dir ) . unEntrySelector) selectors forM_ dirs (liftIO . createDirectoryIfMissing True) forM_ selectors $ \s -> saveEntry s (dir unEntrySelector s) -- | Get the archive comment. getArchiveComment :: ZipArchive (Maybe Text) getArchiveComment = adComment <$> getArchiveDescription -- | Get the archive description record. getArchiveDescription :: ZipArchive ArchiveDescription getArchiveDescription = ZipArchive (gets zsArchive) ---------------------------------------------------------------------------- -- Modifying archive -- | Add a new entry to the archive given its contents in binary form. addEntry :: -- | The compression method to use CompressionMethod -> -- | Entry contents ByteString -> -- | Name of the entry to add EntrySelector -> ZipArchive () addEntry t b s = addPending (I.SinkEntry t (C.yield b) s) -- | Stream data from the specified source to an archive entry. sinkEntry :: -- | The compression method to use CompressionMethod -> -- | Source of entry contents ConduitT () ByteString (ResourceT IO) () -> -- | Name of the entry to add EntrySelector -> ZipArchive () sinkEntry t src s = addPending (I.SinkEntry t src s) -- | Load an entry from a given file. loadEntry :: -- | The compression method to use CompressionMethod -> -- | Name of the entry to add EntrySelector -> -- | Path to the file to add FilePath -> ZipArchive () loadEntry t s path = do apath <- liftIO (canonicalizePath path) modTime <- liftIO (getModificationTime path) let src = CB.sourceFile apath addPending (I.SinkEntry t src s) addPending (I.SetModTime modTime s) #ifndef mingw32_HOST_OS status <- liftIO $ Unix.getFileStatus path setExternalFileAttrs (Unix.fromFileMode (Unix.fileMode status)) s #endif -- | Copy an entry “as is” from another zip archive. If the entry does not -- exist in that archive, 'EntryDoesNotExist' will be thrown. copyEntry :: -- | Path to the archive to copy from FilePath -> -- | Name of the entry (in the source archive) to copy EntrySelector -> -- | Name of the entry to insert (in current archive) EntrySelector -> ZipArchive () copyEntry path s' s = do apath <- liftIO (canonicalizePath path) addPending (I.CopyEntry apath s' s) -- | Add an directory to the archive. Please note that due to the design of -- the library, empty sub-directories will not be added. -- -- The action can throw 'InvalidEntrySelector'. packDirRecur :: -- | The compression method to use CompressionMethod -> -- | How to get the 'EntrySelector' from a path relative to the root of -- the directory we pack (FilePath -> ZipArchive EntrySelector) -> -- | Path to the directory to add FilePath -> ZipArchive () packDirRecur t f = packDirRecur' t f (const $ return ()) -- | The same as 'packDirRecur' but allows us to perform modifying actions -- on the created entities as we go. -- -- @since 1.5.0 packDirRecur' :: -- | The compression method to use CompressionMethod -> -- | How to get the 'EntrySelector' from a path relative to the root of -- the directory we pack (FilePath -> ZipArchive EntrySelector) -> -- | How to modify an entry after creation (EntrySelector -> ZipArchive ()) -> -- | Path to the directory to add FilePath -> ZipArchive () packDirRecur' t f patch path = do files <- liftIO (listDirRecur path) forM_ files $ \x -> do s <- f x loadEntry t s (path x) patch s -- | Rename an entry in the archive. If the entry does not exist, nothing -- will happen. renameEntry :: -- | The original entry name EntrySelector -> -- | The new entry name EntrySelector -> ZipArchive () renameEntry old new = addPending (I.RenameEntry old new) -- | Delete an entry from the archive, if it does not exist, nothing will -- happen. deleteEntry :: EntrySelector -> ZipArchive () deleteEntry s = addPending (I.DeleteEntry s) -- | Change compression method of an entry, if it does not exist, nothing -- will happen. recompress :: -- | The new compression method CompressionMethod -> -- | Name of the entry to re-compress EntrySelector -> ZipArchive () recompress t s = addPending (I.Recompress t s) -- | Set an entry comment, if that entry does not exist, nothing will -- happen. Note that if binary representation of the comment is longer than -- 65535 bytes, it will be truncated on writing. setEntryComment :: -- | Text of the comment Text -> -- | Name of the entry to comment on EntrySelector -> ZipArchive () setEntryComment text s = addPending (I.SetEntryComment text s) -- | Delete an entry's comment, if that entry does not exist, nothing will -- happen. deleteEntryComment :: EntrySelector -> ZipArchive () deleteEntryComment s = addPending (I.DeleteEntryComment s) -- | Set the last modification date\/time. The specified entry may be -- missing, in that case the action has no effect. setModTime :: -- | New modification time UTCTime -> -- | Name of the entry to modify EntrySelector -> ZipArchive () setModTime time s = addPending (I.SetModTime time s) -- | Add an extra field. The specified entry may be missing, in that case -- this action has no effect. addExtraField :: -- | Tag (header id) of the extra field to add Word16 -> -- | Body of the field ByteString -> -- | Name of the entry to modify EntrySelector -> ZipArchive () addExtraField n b s = addPending (I.AddExtraField n b s) -- | Delete an extra field by its type (tag). The specified entry may be -- missing, in that case this action has no effect. deleteExtraField :: -- | Tag (header id) of the extra field to delete Word16 -> -- | Name of the entry to modify EntrySelector -> ZipArchive () deleteExtraField n s = addPending (I.DeleteExtraField n s) -- | Set external file attributes. This function can be used to set file -- permissions. -- -- See also: "Codec.Archive.Zip.Unix". -- -- @since 1.2.0 setExternalFileAttrs :: -- | External file attributes Word32 -> -- | Name of the entry to modify EntrySelector -> ZipArchive () setExternalFileAttrs attrs s = addPending (I.SetExternalFileAttributes attrs s) -- | Perform an action on every entry in the archive. forEntries :: -- | The action to perform (EntrySelector -> ZipArchive ()) -> ZipArchive () forEntries action = getEntries >>= mapM_ action . M.keysSet -- | Set the comment of the entire archive. setArchiveComment :: Text -> ZipArchive () setArchiveComment text = addPending (I.SetArchiveComment text) -- | Delete the archive's comment if it's present. deleteArchiveComment :: ZipArchive () deleteArchiveComment = addPending I.DeleteArchiveComment -- | Undo the changes to a specific archive entry. undoEntryChanges :: EntrySelector -> ZipArchive () undoEntryChanges s = modifyActions f where f = S.filter ((/= Just s) . I.targetEntry) -- | Undo the changes to the archive as a whole (archive's comment). undoArchiveChanges :: ZipArchive () undoArchiveChanges = modifyActions f where f = S.filter ((/= Nothing) . I.targetEntry) -- | Undo all changes made in this editing session. undoAll :: ZipArchive () undoAll = modifyActions (const S.empty) -- | Archive contents are not modified instantly, but instead changes are -- collected as “pending actions” that should be committed, in order to -- efficiently modify the archive in one pass. The actions are committed -- automatically when the program leaves the 'ZipArchive' monad (i.e. as -- part of 'createArchive' or 'withArchive'), or can be forced explicitly -- with the help of this function. Once committed, changes take place in the -- file system and cannot be undone. commit :: ZipArchive () commit = do file <- getFilePath odesc <- getArchiveDescription oentries <- getEntries actions <- getPending exists <- liftIO (doesFileExist file) unless (S.null actions && exists) $ do liftIO (I.commit file odesc oentries actions) -- NOTE The most robust way to update the internal description of the -- archive is to scan it again—manual manipulations with descriptions of -- entries are too error-prone. We also want to erase all pending -- actions because 'I.commit' executes them all by definition. (ndesc, nentries) <- liftIO (I.scanArchive file) ZipArchive . modify $ \st -> st { zsEntries = nentries, zsArchive = ndesc, zsActions = S.empty } ---------------------------------------------------------------------------- -- Helpers -- | Get the path of the actual archive file from inside of 'ZipArchive' -- monad. getFilePath :: ZipArchive FilePath getFilePath = ZipArchive (gets zsFilePath) -- | Get the collection of pending actions. getPending :: ZipArchive (Seq I.PendingAction) getPending = ZipArchive (gets zsActions) -- | Modify the collection of pending actions. modifyActions :: (Seq I.PendingAction -> Seq I.PendingAction) -> ZipArchive () modifyActions f = ZipArchive (modify g) where g st = st {zsActions = f (zsActions st)} -- | Add a new action to the list of pending actions. addPending :: I.PendingAction -> ZipArchive () addPending a = modifyActions (|> a) -- | Recursively list a directory. Do not return paths to empty directories. listDirRecur :: FilePath -> IO [FilePath] listDirRecur path = DList.toList <$> go "" where go adir = do let cdir = path adir raw <- listDirectory cdir fmap mconcat . forM raw $ \case "" -> return mempty "." -> return mempty ".." -> return mempty x -> do let fullx = cdir x adir' = adir x isFile <- doesFileExist fullx isDir <- doesDirectoryExist fullx if isFile then return (DList.singleton adir') else if isDir then go adir' else return mempty -- | Perform an action ignoring IO exceptions it may throw. ignoringAbsence :: IO () -> IO () ignoringAbsence io = catchJust select io handler where select e = if isDoesNotExistError e then Just e else Nothing handler = const (return ()) zip-2.0.1/Codec/Archive/Zip/0000755000000000000000000000000007346545000013645 5ustar0000000000000000zip-2.0.1/Codec/Archive/Zip/CP437.hs0000644000000000000000000000514207346545000014743 0ustar0000000000000000-- | -- Module : Codec.Archive.Zip.CP437 -- Copyright : © 2016–present Mark Karpov -- License : BSD 3 clause -- -- Maintainer : Mark Karpov -- Stability : experimental -- Portability : portable -- -- Support for decoding of CP 437 text. module Codec.Archive.Zip.CP437 ( decodeCP437, ) where import Control.Arrow (first) import Data.ByteString (ByteString) import Data.ByteString qualified as B import Data.Char import Data.Text (Text) import Data.Text qualified as T import Data.Word (Word8) -- | Decode a 'ByteString' containing CP 437 encoded text. decodeCP437 :: ByteString -> Text decodeCP437 bs = T.unfoldrN (B.length bs) (fmap (first decodeByteCP437) . B.uncons) bs -- | Decode a single byte of CP437 encoded text. decodeByteCP437 :: Word8 -> Char decodeByteCP437 byte = chr $ case byte of 128 -> 199 129 -> 252 130 -> 233 131 -> 226 132 -> 228 133 -> 224 134 -> 229 135 -> 231 136 -> 234 137 -> 235 138 -> 232 139 -> 239 140 -> 238 141 -> 236 142 -> 196 143 -> 197 144 -> 201 145 -> 230 146 -> 198 147 -> 244 148 -> 246 149 -> 242 150 -> 251 151 -> 249 152 -> 255 153 -> 214 154 -> 220 155 -> 162 156 -> 163 157 -> 165 158 -> 8359 159 -> 402 160 -> 225 161 -> 237 162 -> 243 163 -> 250 164 -> 241 165 -> 209 166 -> 170 167 -> 186 168 -> 191 169 -> 8976 170 -> 172 171 -> 189 172 -> 188 173 -> 161 174 -> 171 175 -> 187 176 -> 9617 177 -> 9618 178 -> 9619 179 -> 9474 180 -> 9508 181 -> 9569 182 -> 9570 183 -> 9558 184 -> 9557 185 -> 9571 186 -> 9553 187 -> 9559 188 -> 9565 189 -> 9564 190 -> 9563 191 -> 9488 192 -> 9492 193 -> 9524 194 -> 9516 195 -> 9500 196 -> 9472 197 -> 9532 198 -> 9566 199 -> 9567 200 -> 9562 201 -> 9556 202 -> 9577 203 -> 9574 204 -> 9568 205 -> 9552 206 -> 9580 207 -> 9575 208 -> 9576 209 -> 9572 210 -> 9573 211 -> 9561 212 -> 9560 213 -> 9554 214 -> 9555 215 -> 9579 216 -> 9578 217 -> 9496 218 -> 9484 219 -> 9608 220 -> 9604 221 -> 9612 222 -> 9616 223 -> 9600 224 -> 945 225 -> 223 226 -> 915 227 -> 960 228 -> 931 229 -> 963 230 -> 181 231 -> 964 232 -> 934 233 -> 920 234 -> 937 235 -> 948 236 -> 8734 237 -> 966 238 -> 949 239 -> 8745 240 -> 8801 241 -> 177 242 -> 8805 243 -> 8804 244 -> 8992 245 -> 8993 246 -> 247 247 -> 8776 248 -> 176 249 -> 8729 250 -> 183 251 -> 8730 252 -> 8319 253 -> 178 254 -> 9632 255 -> 160 x -> fromIntegral x -- the rest of characters translate directly zip-2.0.1/Codec/Archive/Zip/Internal.hs0000644000000000000000000012737407346545000015773 0ustar0000000000000000{-# LANGUAGE CPP #-} {-# LANGUAGE RecordWildCards #-} {-# LANGUAGE ScopedTypeVariables #-} {-# LANGUAGE TupleSections #-} -- | -- Module : Codec.Archive.Zip.Internal -- Copyright : © 2016–present Mark Karpov -- License : BSD 3 clause -- -- Maintainer : Mark Karpov -- Stability : experimental -- Portability : portable -- -- Low-level, non-public types and operations. module Codec.Archive.Zip.Internal ( PendingAction (..), targetEntry, scanArchive, sourceEntry, crc32Sink, commit, ) where import Codec.Archive.Zip.CP437 (decodeCP437) import Codec.Archive.Zip.Type import Conduit (PrimMonad) import Control.Applicative (many, (<|>)) import Control.Exception (bracketOnError, catchJust) import Control.Monad import Control.Monad.Catch (MonadThrow (..)) import Control.Monad.Trans.Maybe import Control.Monad.Trans.Resource (MonadResource, ResourceT) import Data.Bits import Data.Bool (bool) import Data.ByteString (ByteString) import Data.ByteString qualified as B import Data.Char (ord) import Data.Conduit (ConduitT, ZipSink (..), (.|)) import Data.Conduit qualified as C import Data.Conduit.Binary qualified as CB import Data.Conduit.List qualified as CL import Data.Conduit.Zlib qualified as Z import Data.Digest.CRC32 (crc32Update) import Data.Fixed (Fixed (..)) import Data.Foldable (foldl') import Data.Map.Strict (Map, (!)) import Data.Map.Strict qualified as M import Data.Maybe (catMaybes, fromJust, isNothing) import Data.Sequence (Seq, (><), (|>)) import Data.Sequence qualified as S import Data.Serialize import Data.Set qualified as E import Data.Text (Text) import Data.Text qualified as T import Data.Text.Encoding qualified as T import Data.Time import Data.Version import Data.Void import Data.Word (Word16, Word32) import Numeric.Natural (Natural) import System.Directory import System.FilePath import System.IO import System.IO.Error (isDoesNotExistError) #ifndef mingw32_HOST_OS import qualified Codec.Archive.Zip.Unix as Unix #endif #ifdef ENABLE_BZIP2 import qualified Data.Conduit.BZlib as BZ #endif #ifdef ENABLE_ZSTD import qualified Data.Conduit.Zstd as Zstandard #endif ---------------------------------------------------------------------------- -- Data types -- | The sum type describes all possible actions that can be performed on an -- archive. data PendingAction = -- | Add an entry given its 'Source' SinkEntry CompressionMethod (ConduitT () ByteString (ResourceT IO) ()) EntrySelector | -- | Copy an entry form another archive without re-compression CopyEntry FilePath EntrySelector EntrySelector | -- | Change the name of the entry inside archive RenameEntry EntrySelector EntrySelector | -- | Delete an entry from archive DeleteEntry EntrySelector | -- | Change the compression method on an entry Recompress CompressionMethod EntrySelector | -- | Set the comment for a particular entry SetEntryComment Text EntrySelector | -- | Delete theh comment of a particular entry DeleteEntryComment EntrySelector | -- | Set the modification time of a particular entry SetModTime UTCTime EntrySelector | -- | Add an extra field to the specified entry AddExtraField Word16 ByteString EntrySelector | -- | Delete an extra filed of the specified entry DeleteExtraField Word16 EntrySelector | -- | Set the comment for the entire archive SetArchiveComment Text | -- | Delete the comment of the entire archive DeleteArchiveComment | -- | Set an external file attribute for the specified entry SetExternalFileAttributes Word32 EntrySelector -- | A collection of maps describing how to produce entries in the resulting -- archive. data ProducingActions = ProducingActions { paCopyEntry :: Map FilePath (Map EntrySelector EntrySelector), paSinkEntry :: Map EntrySelector (ConduitT () ByteString (ResourceT IO) ()) } -- | A collection of editing actions, that is, actions that modify already -- existing entries. data EditingActions = EditingActions { eaCompression :: Map EntrySelector CompressionMethod, eaEntryComment :: Map EntrySelector Text, eaDeleteComment :: Map EntrySelector (), eaModTime :: Map EntrySelector UTCTime, eaExtraField :: Map EntrySelector (Map Word16 ByteString), eaDeleteField :: Map EntrySelector (Map Word16 ()), eaExtFileAttr :: Map EntrySelector Word32 } -- | The origin of entries that can be streamed into archive. data EntryOrigin = GenericOrigin | Borrowed EntryDescription -- | The type of the file header: local or central directory. data HeaderType = LocalHeader | CentralDirHeader deriving (Eq) -- | The data descriptor representation. data DataDescriptor = DataDescriptor { ddCRC32 :: Word32, ddCompressedSize :: Natural, ddUncompressedSize :: Natural } -- | A temporary data structure to hold Zip64 extra data field information. data Zip64ExtraField = Zip64ExtraField { z64efUncompressedSize :: Natural, z64efCompressedSize :: Natural, z64efOffset :: Natural } -- | MS-DOS date-time: a pair of 'Word16' (date, time) with the following -- structure: -- -- > DATE bit 0 - 4 5 - 8 9 - 15 -- > value day (1 - 31) month (1 - 12) years from 1980 -- > TIME bit 0 - 4 5 - 10 11 - 15 -- > value seconds* minute hour -- > *stored in two-second increments data MsDosTime = MsDosTime { msDosDate :: Word16, msDosTime :: Word16 } ---------------------------------------------------------------------------- -- Constants -- | “Version created by” to specify when writing archive data. zipVersion :: Version zipVersion = Version [6, 3] [] ---------------------------------------------------------------------------- -- Higher-level operations -- | Scan the central directory of an archive and return its description -- 'ArchiveDescription' as well as a collection of its entries. -- -- This operation may fail with: -- -- * @isAlreadyInUseError@ if the file is already open and cannot be -- reopened; -- -- * @isDoesNotExistError@ if the file does not exist; -- -- * @isPermissionError@ if the user does not have permission to open -- the file; -- -- * 'ParsingFailed' when specified archive is something this library -- cannot parse (this includes multi-disk archives, for example). -- -- Please note that entries with invalid (non-portable) file names may be -- missing in the list of entries. Files that are compressed with -- unsupported compression methods are skipped as well. Also, if several -- entries would collide on some operating systems (such as Windows, because -- of its case-insensitivity), only one of them will be available, because -- 'EntrySelector' is case-insensitive. These are the consequences of the -- design decision to make it impossible to create non-portable archives -- with this library. scanArchive :: -- | Path to archive to scan FilePath -> IO (ArchiveDescription, Map EntrySelector EntryDescription) scanArchive path = withBinaryFile path ReadMode $ \h -> do mecdOffset <- locateECD path h case mecdOffset of Just ecdOffset -> do hSeek h AbsoluteSeek ecdOffset ecdSize <- subtract ecdOffset <$> hFileSize h ecdRaw <- B.hGet h (fromIntegral ecdSize) case runGet getECD ecdRaw of Left msg -> throwM (ParsingFailed path msg) Right ecd -> do hSeek h AbsoluteSeek $ fromIntegral (adCDOffset ecd) cdRaw <- B.hGet h $ fromIntegral (adCDSize ecd) case runGet getCD cdRaw of Left msg -> throwM (ParsingFailed path msg) Right cd -> return (ecd, cd) Nothing -> throwM (ParsingFailed path "Cannot locate end of central directory") -- | Given location of the archive and information about a specific archive -- entry 'EntryDescription', return 'Source' of its data. The actual data -- can be compressed or uncompressed depending on the third argument. sourceEntry :: (PrimMonad m, MonadThrow m, MonadResource m) => -- | Path to archive that contains the entry FilePath -> -- | Information needed to extract entry of interest EntryDescription -> -- | Should we stream uncompressed data? Bool -> -- | Source of uncompressed data ConduitT () ByteString m () sourceEntry path EntryDescription {..} d = source .| CB.isolate (fromIntegral edCompressedSize) .| decompress where source = CB.sourceIOHandle $ do h <- openFile path ReadMode hSeek h AbsoluteSeek (fromIntegral edOffset) localHeader <- B.hGet h 30 case runGet getLocalHeaderGap localHeader of Left msg -> throwM (ParsingFailed path msg) Right gap -> do hSeek h RelativeSeek gap return h decompress = if d then decompressingPipe edCompression else C.awaitForever C.yield -- | Undertake /all/ actions specified as the fourth argument of the -- function. This transforms the given pending actions so they can be -- performed in one pass, and then they are applied in the most efficient -- way. commit :: -- | Location of archive file to edit or create FilePath -> -- | Archive description ArchiveDescription -> -- | Current list of entires Map EntrySelector EntryDescription -> -- | Collection of pending actions Seq PendingAction -> IO () commit path ArchiveDescription {..} entries xs = withNewFile path $ \h -> do let (ProducingActions coping sinking, editing) = optimize (toRecreatingActions path entries >< xs) comment = predictComment adComment xs copiedCD <- M.unions <$> forM (M.keys coping) ( \srcPath -> copyEntries h srcPath (coping ! srcPath) editing ) let sinkingKeys = M.keys $ sinking `M.difference` copiedCD sunkCD <- M.fromList <$> forM sinkingKeys ( \selector -> sinkEntry h selector GenericOrigin (sinking ! selector) editing ) writeCD h comment (copiedCD `M.union` sunkCD) -- | Create a new file with the guarantee that in the case of an exception -- the old file will be intact. The file is only updated\/replaced if the -- second argument finishes without exceptions. withNewFile :: -- | Name of file to create FilePath -> -- | Action that writes to given 'Handle' (Handle -> IO ()) -> IO () withNewFile fpath action = bracketOnError allocate release $ \(path, h) -> do action h hClose h renameFile path fpath where allocate = openBinaryTempFile (takeDirectory fpath) ".zip" release (path, h) = do hClose h -- Despite using `bracketOnError` the file is not guaranteed to exist -- here since we could be interrupted with an async exception after -- the file has been renamed. Therefore, we silentely ignore -- `DoesNotExistError`. catchJust (guard . isDoesNotExistError) (removeFile path) (const $ pure ()) -- | Determine what comment in new archive will look like given its original -- value and a collection of pending actions. predictComment :: Maybe Text -> Seq PendingAction -> Maybe Text predictComment original xs = case S.index xs <$> S.findIndexR (isNothing . targetEntry) xs of Nothing -> original Just DeleteArchiveComment -> Nothing Just (SetArchiveComment txt) -> Just txt Just _ -> Nothing -- | Transform a map representing existing entries into a collection of -- actions that re-create those entires. toRecreatingActions :: -- | Name of the archive file where entires are found FilePath -> -- | Actual list of entires Map EntrySelector EntryDescription -> -- | Actions that recreate the archive entries Seq PendingAction toRecreatingActions path entries = E.foldl' f S.empty (M.keysSet entries) where f s e = s |> CopyEntry path e e -- | Transform a collection of 'PendingAction's into 'ProducingActions' and -- 'EditingActions'—data that describes how to create resulting archive. optimize :: -- | Collection of pending actions Seq PendingAction -> -- | Optimized data (ProducingActions, EditingActions) optimize = foldl' f ( ProducingActions M.empty M.empty, EditingActions M.empty M.empty M.empty M.empty M.empty M.empty M.empty ) where f (pa, ea) a = case a of SinkEntry m src s -> ( pa { paSinkEntry = M.insert s src (paSinkEntry pa), paCopyEntry = M.map (M.filter (/= s)) (paCopyEntry pa) }, (clearEditingFor s ea) { eaCompression = M.insert s m (eaCompression ea) } ) CopyEntry path os ns -> ( pa { paSinkEntry = M.delete ns (paSinkEntry pa), paCopyEntry = M.alter (ef os ns) path (paCopyEntry pa) }, clearEditingFor ns ea ) RenameEntry os ns -> ( pa { paCopyEntry = M.map (M.map $ re os ns) (paCopyEntry pa), paSinkEntry = renameKey os ns (paSinkEntry pa) }, ea { eaCompression = renameKey os ns (eaCompression ea), eaEntryComment = renameKey os ns (eaEntryComment ea), eaDeleteComment = renameKey os ns (eaDeleteComment ea), eaModTime = renameKey os ns (eaModTime ea), eaExtraField = renameKey os ns (eaExtraField ea), eaDeleteField = renameKey os ns (eaDeleteField ea) } ) DeleteEntry s -> ( pa { paSinkEntry = M.delete s (paSinkEntry pa), paCopyEntry = M.map (M.delete s) (paCopyEntry pa) }, clearEditingFor s ea ) Recompress m s -> (pa, ea {eaCompression = M.insert s m (eaCompression ea)}) SetEntryComment txt s -> ( pa, ea { eaEntryComment = M.insert s txt (eaEntryComment ea), eaDeleteComment = M.delete s (eaDeleteComment ea) } ) DeleteEntryComment s -> ( pa, ea { eaEntryComment = M.delete s (eaEntryComment ea), eaDeleteComment = M.insert s () (eaDeleteComment ea) } ) SetModTime time s -> (pa, ea {eaModTime = M.insert s time (eaModTime ea)}) AddExtraField n b s -> ( pa, ea { eaExtraField = M.alter (ef n b) s (eaExtraField ea), eaDeleteField = M.delete s (eaDeleteField ea) } ) DeleteExtraField n s -> ( pa, ea { eaExtraField = M.alter (er n) s (eaExtraField ea), eaDeleteField = M.alter (ef n ()) s (eaDeleteField ea) } ) SetExternalFileAttributes b s -> ( pa, ea {eaExtFileAttr = M.insert s b (eaExtFileAttr ea)} ) _ -> (pa, ea) clearEditingFor s ea = ea { eaCompression = M.delete s (eaCompression ea), eaEntryComment = M.delete s (eaEntryComment ea), eaDeleteComment = M.delete s (eaDeleteComment ea), eaModTime = M.delete s (eaModTime ea), eaExtraField = M.delete s (eaExtraField ea), eaDeleteField = M.delete s (eaDeleteField ea), eaExtFileAttr = M.delete s (eaExtFileAttr ea) } re o n x = if x == o then n else x ef k v (Just m) = Just (M.insert k v m) ef k v Nothing = Just (M.singleton k v) er k (Just m) = let n = M.delete k m in if M.null n then Nothing else Just n er _ Nothing = Nothing -- | Copy entries from another archive and write them into the file -- associated with the given handle. This can throw 'EntryDoesNotExist' if -- there is no such entry in that archive. copyEntries :: -- | Opened 'Handle' of zip archive file Handle -> -- | Path to the file to copy the entries from FilePath -> -- | 'Map' from original name to name to use in new archive Map EntrySelector EntrySelector -> -- | Additional info that can influence result EditingActions -> -- | Info to generate central directory file headers later IO (Map EntrySelector EntryDescription) copyEntries h path m e = do entries <- snd <$> scanArchive path done <- forM (M.keys m) $ \s -> case s `M.lookup` entries of Nothing -> throwM (EntryDoesNotExist path s) Just desc -> sinkEntry h (m ! s) (Borrowed desc) (sourceEntry path desc False) e return (M.fromList done) -- | Sink an entry from the given stream into the file associated with the -- given 'Handle'. sinkEntry :: -- | Opened 'Handle' of zip archive file Handle -> -- | Name of the entry to add EntrySelector -> -- | Origin of the entry (can contain additional info) EntryOrigin -> -- | Source of the entry contents ConduitT () ByteString (ResourceT IO) () -> -- | Additional info that can influence result EditingActions -> -- | Info to generate the central directory file headers later IO (EntrySelector, EntryDescription) sinkEntry h s o src EditingActions {..} = do currentTime <- getCurrentTime offset <- hTell h let compressed = case o of GenericOrigin -> Store Borrowed ed -> edCompression ed compression = M.findWithDefault compressed s eaCompression recompression = compression /= compressed modTime = case o of GenericOrigin -> currentTime Borrowed ed -> edModTime ed extFileAttr = case o of GenericOrigin -> M.findWithDefault defaultFileMode s eaExtFileAttr Borrowed _ -> M.findWithDefault defaultFileMode s eaExtFileAttr oldExtraFields = case o of GenericOrigin -> M.empty Borrowed ed -> edExtraField ed extraField = (M.findWithDefault M.empty s eaExtraField `M.union` oldExtraFields) `M.difference` M.findWithDefault M.empty s eaDeleteField oldComment = case (o, M.lookup s eaDeleteComment) of (GenericOrigin, _) -> Nothing (Borrowed ed, Nothing) -> edComment ed (Borrowed _, Just ()) -> Nothing desc0 = EntryDescription -- to write in local header { edVersionMadeBy = zipVersion, edVersionNeeded = zipVersion, edCompression = compression, edModTime = M.findWithDefault modTime s eaModTime, edCRC32 = 0, -- to be overwritten after streaming edCompressedSize = 0, -- ↑ edUncompressedSize = 0, -- ↑ edOffset = fromIntegral offset, edComment = M.lookup s eaEntryComment <|> oldComment, edExtraField = extraField, edExternalFileAttrs = extFileAttr } B.hPut h (runPut (putHeader LocalHeader s desc0)) DataDescriptor {..} <- C.runConduitRes $ if recompression then if compressed == Store then src .| sinkData h compression else src .| decompressingPipe compressed .| sinkData h compression else src .| sinkData h Store afterStreaming <- hTell h let desc1 = case o of GenericOrigin -> desc0 { edCRC32 = ddCRC32, edCompressedSize = ddCompressedSize, edUncompressedSize = ddUncompressedSize } Borrowed ed -> desc0 { edCRC32 = bool (edCRC32 ed) ddCRC32 recompression, edCompressedSize = bool (edCompressedSize ed) ddCompressedSize recompression, edUncompressedSize = bool (edUncompressedSize ed) ddUncompressedSize recompression } desc2 = desc1 { edVersionNeeded = getZipVersion (needsZip64 desc1) (Just compression) } hSeek h AbsoluteSeek offset B.hPut h (runPut (putHeader LocalHeader s desc2)) hSeek h AbsoluteSeek afterStreaming return (s, desc2) {- ORMOLU_DISABLE -} -- | Create a 'Sink' to stream data there. Once streaming is finished, -- return 'DataDescriptor' for the streamed data. The action /does not/ -- close the given 'Handle'. sinkData :: -- | Opened 'Handle' of zip archive file Handle -> -- | Compression method to apply CompressionMethod -> -- | 'Sink' where to stream data ConduitT ByteString Void (ResourceT IO) DataDescriptor sinkData h compression = do let sizeSink = CL.fold (\acc input -> fromIntegral (B.length input) + acc) 0 dataSink = getZipSink $ ZipSink sizeSink <* ZipSink (CB.sinkHandle h) withCompression sink = getZipSink $ (,,) <$> ZipSink sizeSink <*> ZipSink crc32Sink <*> ZipSink sink (uncompressedSize, crc32, compressedSize) <- case compression of Store -> withCompression dataSink Deflate -> withCompression $ Z.compress 9 (Z.WindowBits (-15)) .| dataSink #ifdef ENABLE_BZIP2 BZip2 -> withCompression $ BZ.bzip2 .| dataSink #else BZip2 -> throwM (UnsupportedCompressionMethod BZip2) #endif #ifdef ENABLE_ZSTD Zstd -> withCompression $ Zstandard.compress 1 .| dataSink #else Zstd -> throwM (UnsupportedCompressionMethod Zstd) #endif return DataDescriptor { ddCRC32 = fromIntegral crc32, ddCompressedSize = compressedSize, ddUncompressedSize = uncompressedSize } {- ORMOLU_ENABLE -} -- | Append central directory entries and the end of central directory -- record to the file that given 'Handle' is associated with. Note that this -- automatically writes Zip64 end of central directory record and Zip64 end -- of central directory locator when necessary. writeCD :: -- | Opened handle of zip archive file Handle -> -- | Commentary to the entire archive Maybe Text -> -- | Info about already written local headers and entry data Map EntrySelector EntryDescription -> IO () writeCD h comment m = do let cd = runPut (putCD m) cdOffset <- fromIntegral <$> hTell h B.hPut h cd -- write central directory let totalCount = fromIntegral (M.size m) cdSize = fromIntegral (B.length cd) needZip64 = totalCount >= ffff || cdSize >= ffffffff || cdOffset >= ffffffff when needZip64 $ do zip64ecdOffset <- fromIntegral <$> hTell h (B.hPut h . runPut) (putZip64ECD totalCount cdSize cdOffset) (B.hPut h . runPut) (putZip64ECDLocator zip64ecdOffset) (B.hPut h . runPut) (putECD totalCount cdSize cdOffset comment) ---------------------------------------------------------------------------- -- Binary serialization -- | Extract the number of bytes between the start of file name in local -- header and the start of actual data. getLocalHeaderGap :: Get Integer getLocalHeaderGap = do getSignature 0x04034b50 skip 2 -- version needed to extract skip 2 -- general purpose bit flag skip 2 -- compression method skip 2 -- last mod file time skip 2 -- last mod file date skip 4 -- crc-32 check sum skip 4 -- compressed size skip 4 -- uncompressed size fileNameSize <- fromIntegral <$> getWord16le -- file name length extraFieldSize <- fromIntegral <$> getWord16le -- extra field length return (fileNameSize + extraFieldSize) -- | Parse central directory file headers and put them into a 'Map'. getCD :: Get (Map EntrySelector EntryDescription) getCD = M.fromList . catMaybes <$> many getCDHeader -- | Parse a single central directory file header. If it's a directory or -- file compressed with unsupported compression method, 'Nothing' is -- returned. getCDHeader :: Get (Maybe (EntrySelector, EntryDescription)) getCDHeader = do getSignature 0x02014b50 -- central file header signature versionMadeBy <- toVersion <$> getWord16le -- version made by versionNeeded <- toVersion <$> getWord16le -- version needed to extract when (versionNeeded > zipVersion) . fail $ "Version required to extract the archive is " ++ showVersion versionNeeded ++ " (can do " ++ showVersion zipVersion ++ ")" bitFlag <- getWord16le -- general purpose bit flag when (any (testBit bitFlag) [0, 6, 13]) . fail $ "Encrypted archives are not supported" let needUnicode = testBit bitFlag 11 mcompression <- toCompressionMethod <$> getWord16le -- compression method modTime <- getWord16le -- last mod file time modDate <- getWord16le -- last mod file date crc32 <- getWord32le -- CRC32 check sum compressed <- fromIntegral <$> getWord32le -- compressed size uncompressed <- fromIntegral <$> getWord32le -- uncompressed size fileNameSize <- getWord16le -- file name length extraFieldSize <- getWord16le -- extra field length commentSize <- getWord16le -- file comment size skip 4 -- disk number start, internal file attributes externalFileAttrs <- getWord32le -- external file attributes offset <- fromIntegral <$> getWord32le -- offset of local header fileName <- decodeText needUnicode <$> getBytes (fromIntegral fileNameSize) -- file name extraField <- M.fromList <$> isolate (fromIntegral extraFieldSize) (many getExtraField) -- ↑ extra fields in their raw form comment <- decodeText needUnicode <$> getBytes (fromIntegral commentSize) -- ↑ file comment let dfltZip64 = Zip64ExtraField { z64efUncompressedSize = uncompressed, z64efCompressedSize = compressed, z64efOffset = offset } z64ef = case M.lookup 1 extraField of Nothing -> dfltZip64 Just b -> parseZip64ExtraField dfltZip64 b case mcompression of Nothing -> return Nothing Just compression -> let desc = EntryDescription { edVersionMadeBy = versionMadeBy, edVersionNeeded = versionNeeded, edCompression = compression, edModTime = fromMsDosTime (MsDosTime modDate modTime), edCRC32 = crc32, edCompressedSize = z64efCompressedSize z64ef, edUncompressedSize = z64efUncompressedSize z64ef, edOffset = z64efOffset z64ef, edComment = if commentSize == 0 then Nothing else comment, edExtraField = extraField, edExternalFileAttrs = externalFileAttrs } in return $ (,desc) <$> (fileName >>= mkEntrySelector . T.unpack) -- | Parse an extra-field. getExtraField :: Get (Word16, ByteString) getExtraField = do header <- getWord16le -- header id size <- getWord16le -- data size body <- getBytes (fromIntegral size) -- content return (header, body) -- | Get signature. If the extracted data is not equal to the provided -- signature, fail. getSignature :: Word32 -> Get () getSignature sig = do x <- getWord32le -- grab 4-byte signature unless (x == sig) . fail $ "Expected signature " ++ show sig ++ ", but got: " ++ show x -- | Parse 'Zip64ExtraField' from its binary representation. parseZip64ExtraField :: -- | What is read from central directory file header Zip64ExtraField -> -- | Actual binary representation ByteString -> -- | Result Zip64ExtraField parseZip64ExtraField dflt@Zip64ExtraField {..} b = either (const dflt) id . flip runGet b $ do let ifsat v = if v >= ffffffff then fromIntegral <$> getWord64le else return v uncompressed <- ifsat z64efUncompressedSize -- uncompressed size compressed <- ifsat z64efCompressedSize -- compressed size offset <- ifsat z64efOffset -- offset of local file header return (Zip64ExtraField uncompressed compressed offset) -- | Produce binary representation of 'Zip64ExtraField'. makeZip64ExtraField :: -- | Is this for local or central directory header? HeaderType -> -- | Zip64 extra field's data Zip64ExtraField -> -- | Resulting representation ByteString makeZip64ExtraField headerType Zip64ExtraField {..} = runPut $ do case headerType of LocalHeader -> do putWord64le (fromIntegral z64efUncompressedSize) -- uncompressed size putWord64le (fromIntegral z64efCompressedSize) -- compressed size CentralDirHeader -> do when (z64efUncompressedSize >= ffffffff) $ putWord64le (fromIntegral z64efUncompressedSize) -- uncompressed size when (z64efCompressedSize >= ffffffff) $ putWord64le (fromIntegral z64efCompressedSize) -- compressed size when (z64efOffset >= ffffffff) $ putWord64le (fromIntegral z64efOffset) -- offset of local file header -- | Create 'ByteString' representing an extra field. putExtraField :: Map Word16 ByteString -> Put putExtraField m = forM_ (M.keys m) $ \headerId -> do let b = B.take 0xffff (m ! headerId) putWord16le headerId putWord16le (fromIntegral $ B.length b) putByteString b -- | Create 'ByteString' representing the entire central directory. putCD :: Map EntrySelector EntryDescription -> Put putCD m = forM_ (M.keys m) $ \s -> putHeader CentralDirHeader s (m ! s) -- | Create 'ByteString' representing either a local file header or a -- central directory file header. putHeader :: -- | Type of header to generate HeaderType -> -- | Name of entry to write EntrySelector -> -- | Description of entry EntryDescription -> Put putHeader headerType s entry@EntryDescription {..} = do let isLocalHeader = headerType == LocalHeader isCentralDirHeader = headerType == CentralDirHeader putWord32le (bool 0x04034b50 0x02014b50 isCentralDirHeader) -- ↑ local/central file header signature when isCentralDirHeader $ putWord16le (fromVersion edVersionMadeBy) -- version made by putWord16le (fromVersion edVersionNeeded) -- version needed to extract let entryName = getEntryName s rawName = T.encodeUtf8 entryName comment = B.take 0xffff (maybe B.empty T.encodeUtf8 edComment) unicode = needsUnicode entryName || maybe False needsUnicode edComment modTime = toMsDosTime edModTime putWord16le (if unicode then setBit 0 11 else 0) -- ↑ general purpose bit-flag putWord16le (fromCompressionMethod edCompression) -- compression method putWord16le (msDosTime modTime) -- last mod file time putWord16le (msDosDate modTime) -- last mod file date putWord32le edCRC32 -- CRC-32 checksum putWord32le (withSaturation edCompressedSize) -- compressed size putWord32le (withSaturation edUncompressedSize) -- uncompressed size putWord16le (fromIntegral $ B.length rawName) -- file name length let zip64ef = makeZip64ExtraField headerType Zip64ExtraField { z64efUncompressedSize = edUncompressedSize, z64efCompressedSize = edCompressedSize, z64efOffset = edOffset } extraField = B.take 0xffff . runPut . putExtraField $ if needsZip64 entry || isLocalHeader then M.insert 1 zip64ef edExtraField else edExtraField putWord16le (fromIntegral $ B.length extraField) -- extra field length when isCentralDirHeader $ do putWord16le (fromIntegral $ B.length comment) -- file comment length putWord16le 0 -- disk number start putWord16le 0 -- internal file attributes putWord32le edExternalFileAttrs -- external file attributes putWord32le (withSaturation edOffset) -- relative offset of local header putByteString rawName -- file name (variable size) putByteString extraField -- extra field (variable size) when isCentralDirHeader (putByteString comment) -- file comment (variable size) -- | Create 'ByteString' representing Zip64 end of central directory record. putZip64ECD :: -- | Total number of entries Natural -> -- | Size of the central directory Natural -> -- | Offset of central directory record Natural -> Put putZip64ECD totalCount cdSize cdOffset = do putWord32le 0x06064b50 -- zip64 end of central dir signature putWord64le 44 -- size of zip64 end of central dir record putWord16le (fromVersion zipVersion) -- version made by putWord16le (fromVersion $ getZipVersion True Nothing) -- ↑ version needed to extract putWord32le 0 -- number of this disk putWord32le 0 -- number of the disk with the start of the central directory putWord64le (fromIntegral totalCount) -- total number of entries (this disk) putWord64le (fromIntegral totalCount) -- total number of entries putWord64le (fromIntegral cdSize) -- size of the central directory putWord64le (fromIntegral cdOffset) -- offset of central directory -- | Create 'ByteString' representing Zip64 end of the central directory -- locator. putZip64ECDLocator :: -- | Offset of Zip64 end of central directory Natural -> Put putZip64ECDLocator ecdOffset = do putWord32le 0x07064b50 -- zip64 end of central dir locator signature putWord32le 0 -- number of the disk with the start of the zip64 end of -- central directory putWord64le (fromIntegral ecdOffset) -- relative offset of the zip64 end -- of central directory record putWord32le 1 -- total number of disks -- | Parse end of the central directory record or Zip64 end of the central -- directory record depending on signature binary data begins with. getECD :: Get ArchiveDescription getECD = do sig <- getWord32le -- end of central directory signature let zip64 = sig == 0x06064b50 unless (sig == 0x06054b50 || sig == 0x06064b50) $ fail "Cannot locate end of central directory" zip64size <- if zip64 then do x <- getWord64le -- size of zip64 end of central directory record skip 2 -- version made by skip 2 -- version needed to extract return (Just x) else return Nothing thisDisk <- bool (fromIntegral <$> getWord16le) getWord32le zip64 -- ↑ number of this disk cdDisk <- bool (fromIntegral <$> getWord16le) getWord32le zip64 -- ↑ number of the disk with the start of the central directory unless (thisDisk == 0 && cdDisk == 0) $ fail "No support for multi-disk archives" skip (bool 2 8 zip64) -- ↑ total number of entries in the central directory on this disk skip (bool 2 8 zip64) -- ↑ total number of entries in the central directory cdSize <- bool (fromIntegral <$> getWord32le) getWord64le zip64 -- ↑ size of the central directory cdOffset <- bool (fromIntegral <$> getWord32le) getWord64le zip64 -- ↑ offset of start of central directory with respect to the starting -- disk number when zip64 . skip . fromIntegral $ fromJust zip64size - 4 -- obviously commentSize <- getWord16le -- .ZIP file comment length comment <- decodeText True <$> getBytes (fromIntegral commentSize) -- ↑ archive comment, it's uncertain how we should decide on encoding here return ArchiveDescription { adComment = if commentSize == 0 then Nothing else comment, adCDOffset = fromIntegral cdOffset, adCDSize = fromIntegral cdSize } -- | Create a 'ByteString' representing the end of central directory record. putECD :: -- | Total number of entries Natural -> -- | Size of the central directory Natural -> -- | Offset of central directory record Natural -> -- | Zip file comment Maybe Text -> Put putECD totalCount cdSize cdOffset mcomment = do putWord32le 0x06054b50 -- end of central dir signature putWord16le 0 -- number of this disk putWord16le 0 -- number of the disk with the start of the central directory putWord16le (withSaturation totalCount) -- ↑ total number of entries on this disk putWord16le (withSaturation totalCount) -- total number of entries putWord32le (withSaturation cdSize) -- size of central directory putWord32le (withSaturation cdOffset) -- offset of start of central directory let comment = maybe B.empty T.encodeUtf8 mcomment putWord16le (fromIntegral $ B.length comment) putByteString comment -- | Find the absolute offset of the end of central directory record or, if -- present, Zip64 end of central directory record. locateECD :: FilePath -> Handle -> IO (Maybe Integer) locateECD path h = sizeCheck where sizeCheck = do fsize <- hFileSize h let limit = max 0 (fsize - 0xffff - 22) if fsize < 22 then return Nothing else hSeek h SeekFromEnd (-22) >> loop limit loop limit = do sig <- getNum getWord32le 4 pos <- subtract 4 <$> hTell h let again = hSeek h AbsoluteSeek (pos - 1) >> loop limit done = pos <= limit if sig == 0x06054b50 then do result <- runMaybeT $ MaybeT (checkComment pos) >>= MaybeT . checkCDSig >>= MaybeT . checkZip64 case result of Nothing -> bool again (return Nothing) done Just ecd -> return (Just ecd) else bool again (return Nothing) done checkComment pos = do size <- hFileSize h hSeek h AbsoluteSeek (pos + 20) l <- fromIntegral <$> getNum getWord16le 2 return $ if l + 22 == size - pos then Just pos else Nothing checkCDSig pos = do hSeek h AbsoluteSeek (pos + 16) sigPos <- fromIntegral <$> getNum getWord32le 4 if sigPos == 0xffffffff -- Zip64 is probably used then return (Just pos) else do hSeek h AbsoluteSeek sigPos cdSig <- getNum getWord32le 4 return $ if cdSig == 0x02014b50 || -- ↑ normal case: central directory file header signature cdSig == 0x06064b50 || -- ↑ happens when zip 64 archive is empty cdSig == 0x06054b50 then -- ↑ happens when vanilla archive is empty Just pos else Nothing checkZip64 pos = if pos < 20 then return (Just pos) else do hSeek h AbsoluteSeek (pos - 20) zip64locatorSig <- getNum getWord32le 4 if zip64locatorSig == 0x07064b50 then do hSeek h AbsoluteSeek (pos - 12) Just . fromIntegral <$> getNum getWord64le 8 else return (Just pos) getNum f n = do result <- runGet f <$> B.hGet h n case result of Left msg -> throwM (ParsingFailed path msg) Right val -> return val ---------------------------------------------------------------------------- -- Helpers -- | Rename an entry (key) in a 'Map'. renameKey :: (Ord k) => k -> k -> Map k a -> Map k a renameKey ok nk m = case M.lookup ok m of Nothing -> m Just e -> M.insert nk e (M.delete ok m) -- | Like 'fromIntegral', but with saturation when converting to bounded -- types. withSaturation :: forall a b. (Integral a, Integral b, Bounded b) => a -> b withSaturation x = if (fromIntegral x :: Integer) > (fromIntegral bound :: Integer) then bound else fromIntegral x where bound = maxBound :: b -- | Determine the target entry of an action. targetEntry :: PendingAction -> Maybe EntrySelector targetEntry (SinkEntry _ _ s) = Just s targetEntry (CopyEntry _ _ s) = Just s targetEntry (RenameEntry s _) = Just s targetEntry (DeleteEntry s) = Just s targetEntry (Recompress _ s) = Just s targetEntry (SetEntryComment _ s) = Just s targetEntry (DeleteEntryComment s) = Just s targetEntry (SetModTime _ s) = Just s targetEntry (AddExtraField _ _ s) = Just s targetEntry (DeleteExtraField _ s) = Just s targetEntry (SetExternalFileAttributes _ s) = Just s targetEntry (SetArchiveComment _) = Nothing targetEntry DeleteArchiveComment = Nothing -- | Decode a 'ByteString'. The first argument indicates whether we should -- treat it as UTF-8 (in case bit 11 of general-purpose bit flag is set), -- otherwise the function assumes CP437. Note that since not every stream of -- bytes constitutes valid UTF-8 text, this function can fail. In that case -- 'Nothing' is returned. decodeText :: -- | Whether bit 11 of general-purpose bit flag is set Bool -> -- | Binary data to decode ByteString -> -- | Decoded 'Text' in case of success Maybe Text decodeText False = Just . decodeCP437 decodeText True = either (const Nothing) Just . T.decodeUtf8' -- | Detect if the given text needs newer Unicode-aware features to be -- properly encoded in the archive. needsUnicode :: Text -> Bool needsUnicode = not . T.all validCP437 where validCP437 x = ord x <= 127 -- | Convert numeric representation (as per the .ZIP specification) of -- version into 'Version'. toVersion :: Word16 -> Version toVersion x = makeVersion [major, minor] where (major, minor) = quotRem (fromIntegral $ x .&. 0x00ff) 10 -- | Covert 'Version' to its numeric representation as per the .ZIP -- specification. fromVersion :: Version -> Word16 fromVersion v = fromIntegral ((ZIP_OS `shiftL` 8) .|. (major * 10 + minor)) where (major, minor) = case versionBranch v of v0 : v1 : _ -> (v0, v1) v0 : _ -> (v0, 0) [] -> (0, 0) -- | Get the compression method form its numeric representation. toCompressionMethod :: Word16 -> Maybe CompressionMethod toCompressionMethod 0 = Just Store toCompressionMethod 8 = Just Deflate toCompressionMethod 12 = Just BZip2 toCompressionMethod 93 = Just Zstd toCompressionMethod _ = Nothing -- | Convert 'CompressionMethod' to its numeric representation as per the -- .ZIP specification. fromCompressionMethod :: CompressionMethod -> Word16 fromCompressionMethod Store = 0 fromCompressionMethod Deflate = 8 fromCompressionMethod BZip2 = 12 fromCompressionMethod Zstd = 93 -- | Check if an entry with these parameters needs the Zip64 extension. needsZip64 :: EntryDescription -> Bool needsZip64 EntryDescription {..} = any (>= ffffffff) [edOffset, edCompressedSize, edUncompressedSize] -- | Determine “version needed to extract” that should be written to the -- headers given the need of the Zip64 feature and the compression method. getZipVersion :: Bool -> Maybe CompressionMethod -> Version getZipVersion zip64 m = max zip64ver mver where zip64ver = makeVersion (if zip64 then [4, 5] else [2, 0]) mver = makeVersion $ case m of Nothing -> [2, 0] Just Store -> [2, 0] Just Deflate -> [2, 0] Just BZip2 -> [4, 6] Just Zstd -> [6, 3] -- | Return a decompressing 'Conduit' corresponding to the given compression -- method. decompressingPipe :: (PrimMonad m, MonadThrow m, MonadResource m) => CompressionMethod -> ConduitT ByteString ByteString m () decompressingPipe Store = C.awaitForever C.yield decompressingPipe Deflate = Z.decompress $ Z.WindowBits (-15) #ifdef ENABLE_BZIP2 decompressingPipe BZip2 = BZ.bunzip2 #else decompressingPipe BZip2 = throwM (UnsupportedCompressionMethod BZip2) #endif #ifdef ENABLE_ZSTD decompressingPipe Zstd = Zstandard.decompress #else decompressingPipe Zstd = throwM (UnsupportedCompressionMethod Zstd) #endif -- | A sink that calculates the CRC32 check sum for an incoming stream. crc32Sink :: ConduitT ByteString Void (ResourceT IO) Word32 crc32Sink = CL.fold crc32Update 0 -- | Convert 'UTCTime' to the MS-DOS time format. toMsDosTime :: UTCTime -> MsDosTime toMsDosTime UTCTime {..} = MsDosTime dosDate dosTime where dosTime = fromIntegral (seconds + shiftL minutes 5 + shiftL hours 11) dosDate = fromIntegral (day + shiftL month 5 + shiftL year 9) seconds = let (MkFixed x) = todSec tod in fromIntegral (x `quot` 2000000000000) minutes = todMin tod hours = todHour tod tod = timeToTimeOfDay utctDayTime year = fromIntegral year' - 1980 (year', month, day) = toGregorian utctDay -- | Convert MS-DOS date-time to 'UTCTime'. fromMsDosTime :: MsDosTime -> UTCTime fromMsDosTime MsDosTime {..} = UTCTime (fromGregorian year month day) (secondsToDiffTime $ hours * 3600 + minutes * 60 + seconds) where seconds = fromIntegral $ 2 * (msDosTime .&. 0x1f) minutes = fromIntegral (shiftR msDosTime 5 .&. 0x3f) hours = fromIntegral (shiftR msDosTime 11 .&. 0x1f) day = fromIntegral (msDosDate .&. 0x1f) month = fromIntegral $ shiftR msDosDate 5 .&. 0x0f year = 1980 + fromIntegral (shiftR msDosDate 9) -- We use the constants of the type 'Natural' instead of literals to protect -- ourselves from overflows on 32 bit systems. -- -- If we're in the development mode, use lower values so the tests get a -- chance to check all cases (otherwise we would need to generate way too -- big archives on CI). ffff, ffffffff :: Natural #ifdef HASKELL_ZIP_DEV_MODE ffff = 200 ffffffff = 5000 #else ffff = 0xffff ffffffff = 0xffffffff #endif -- | The default permissions for the files, permissions not set on Windows, -- and are set to rw on Unix. This mimics the behavior of the zip utility. defaultFileMode :: Word32 #ifdef mingw32_HOST_OS defaultFileMode = 0 #else defaultFileMode = Unix.fromFileMode 0o600 #endif zip-2.0.1/Codec/Archive/Zip/Type.hs0000644000000000000000000001764707346545000015141 0ustar0000000000000000{-# LANGUAGE CPP #-} {-# LANGUAGE DeriveDataTypeable #-} -- | -- Module : Codec.Archive.Zip.Type -- Copyright : © 2016–present Mark Karpov -- License : BSD 3 clause -- -- Maintainer : Mark Karpov -- Stability : experimental -- Portability : portable -- -- Types used by the package. module Codec.Archive.Zip.Type ( -- * Entry selector EntrySelector, mkEntrySelector, unEntrySelector, getEntryName, EntrySelectorException (..), -- * Entry description EntryDescription (..), CompressionMethod (..), -- * Archive description ArchiveDescription (..), -- * Exceptions ZipException (..), ) where import Control.Exception (Exception) import Control.Monad.Catch (MonadThrow (..)) import Data.ByteString (ByteString) import Data.ByteString qualified as B import Data.CaseInsensitive (CI) import Data.CaseInsensitive qualified as CI import Data.Data (Data) import Data.List.NonEmpty (NonEmpty) import Data.List.NonEmpty qualified as NE import Data.Map (Map) import Data.Maybe (mapMaybe) import Data.Text (Text) import Data.Text qualified as T import Data.Text.Encoding qualified as T import Data.Time.Clock (UTCTime) import Data.Typeable (Typeable) import Data.Version (Version) import Data.Word (Word16, Word32) import Numeric.Natural import System.FilePath qualified as FP import System.FilePath.Posix qualified as Posix import System.FilePath.Windows qualified as Windows ---------------------------------------------------------------------------- -- Entry selector -- | This data type serves for naming and selection of archive entries. It -- can be created only with the help of the smart constructor -- 'mkEntrySelector', and it's the only “key” that can be used to refer to -- files in the archive or to name new archive entries. -- -- The abstraction is crucial for ensuring that created archives are -- portable across operating systems, file systems, and platforms. Since on -- some operating systems, file paths are case-insensitive, this selector is -- also case-insensitive. It makes sure that only relative paths are used to -- name files inside archive, as it's recommended in the specification. It -- also guarantees that forward slashes are used when the path is stored -- inside the archive for compatibility with Unix-like operating systems (as -- recommended in the specification). On the other hand, in can be rendered -- as an ordinary relative file path in OS-specific format when needed. newtype EntrySelector = EntrySelector { -- | Path pieces of relative path inside archive unES :: NonEmpty (CI String) } deriving (Eq, Ord, Typeable) instance Show EntrySelector where show = show . unEntrySelector -- | Create an 'EntrySelector' from a 'FilePath'. To avoid problems with -- distribution of the archive, characters that some operating systems do -- not expect in paths are not allowed. -- -- Argument to 'mkEntrySelector' should pass these checks: -- -- * 'System.FilePath.Posix.isValid' -- * 'System.FilePath.Windows.isValid' -- * it is a relative path without slash at the end -- * binary representations of normalized path should be not longer than -- 65535 bytes -- -- This function can throw an 'EntrySelectorException'. mkEntrySelector :: (MonadThrow m) => FilePath -> m EntrySelector mkEntrySelector path = let f x = case filter (not . FP.isPathSeparator) x of [] -> Nothing xs -> Just (CI.mk xs) giveup = throwM (InvalidEntrySelector path) in case NE.nonEmpty (mapMaybe f (FP.splitPath path)) of Nothing -> giveup Just pieces -> let selector = EntrySelector pieces binLength = B.length . T.encodeUtf8 . getEntryName in if Posix.isValid path && Windows.isValid path && not (FP.isAbsolute path || FP.hasTrailingPathSeparator path) && (CI.mk "." `notElem` pieces) && (CI.mk ".." `notElem` pieces) && binLength selector <= 0xffff then return selector else giveup -- | Restore a relative path from 'EntrySelector'. Every 'EntrySelector' -- corresponds to a 'FilePath'. unEntrySelector :: EntrySelector -> FilePath unEntrySelector = FP.joinPath . fmap CI.original . NE.toList . unES -- | Get an entry name in the from that is suitable for writing to file -- header, given an 'EntrySelector'. getEntryName :: EntrySelector -> Text getEntryName = T.pack . concat . NE.toList . NE.intersperse "/" . fmap CI.original . unES -- | The problems you can have with an 'EntrySelector'. newtype EntrySelectorException = -- | 'EntrySelector' cannot be created from this path InvalidEntrySelector FilePath deriving (Eq, Ord, Typeable) instance Show EntrySelectorException where show (InvalidEntrySelector path) = "Cannot build selector from " ++ show path instance Exception EntrySelectorException ---------------------------------------------------------------------------- -- Entry description -- | The information about archive entry that can be stored in a zip -- archive. It does not mirror local file header or central directory file -- header, but their binary representations can be built given this data -- structure and the archive contents. data EntryDescription = EntryDescription { -- | Version made by edVersionMadeBy :: Version, -- | Version needed to extract edVersionNeeded :: Version, -- | Compression method edCompression :: CompressionMethod, -- | Last modification date and time edModTime :: UTCTime, -- | CRC32 check sum edCRC32 :: Word32, -- | Size of compressed entry edCompressedSize :: Natural, -- | Size of uncompressed entry edUncompressedSize :: Natural, -- | Absolute offset of local file header edOffset :: Natural, -- | Entry comment edComment :: Maybe Text, -- | All extra fields found edExtraField :: Map Word16 ByteString, -- | External file attributes -- -- @since 1.2.0 edExternalFileAttrs :: Word32 } deriving (Eq, Typeable) -- | The supported compression methods. data CompressionMethod = -- | Store file uncompressed Store | -- | Deflate Deflate | -- | Compressed using BZip2 algorithm BZip2 | -- | Compressed using Zstandard algorithm -- -- @since 1.6.0 Zstd deriving (Show, Read, Eq, Ord, Enum, Bounded, Data, Typeable) ---------------------------------------------------------------------------- -- Archive description -- | The information about the archive as a whole. data ArchiveDescription = ArchiveDescription { -- | The comment of the entire archive adComment :: Maybe Text, -- | Absolute offset of the start of central directory adCDOffset :: Natural, -- | The size of central directory record adCDSize :: Natural } deriving (Show, Read, Eq, Ord, Typeable, Data) ---------------------------------------------------------------------------- -- Exceptions -- | The bad things that can happen when you use the library. data ZipException = -- | Thrown when you try to get contents of non-existing entry EntryDoesNotExist FilePath EntrySelector | -- | Thrown when attempting to decompress an entry compressed with an -- unsupported compression method or the library is compiled without -- support for it. -- -- @since 2.0.0 UnsupportedCompressionMethod CompressionMethod | -- | Thrown when archive structure cannot be parsed. ParsingFailed FilePath String deriving (Eq, Ord, Typeable) instance Show ZipException where show (EntryDoesNotExist file s) = "No such entry found: " ++ show s ++ " in " ++ show file show (ParsingFailed file msg) = "Parsing of archive structure failed: \n" ++ msg ++ "\nin " ++ show file show (UnsupportedCompressionMethod method) = "Encountered a zipfile entry with " ++ show method ++ " compression, but " ++ "zip library does not support it or has been built without support for it." instance Exception ZipException zip-2.0.1/Codec/Archive/Zip/Unix.hs0000644000000000000000000000167407346545000015134 0ustar0000000000000000-- | -- Module : Codec.Archive.Zip.Unix -- Copyright : © 2016–present Mark Karpov -- License : BSD 3 clause -- -- Maintainer : Mark Karpov -- Stability : experimental -- Portability : portable -- -- Unix-specific functionality of zip archives. -- -- @since 1.4.0 module Codec.Archive.Zip.Unix ( toFileMode, fromFileMode, ) where import Data.Bits import Data.Word import System.Posix.Types (CMode (..)) -- | Convert external attributes to the file info. -- -- >>> toFileMode 2179792896 -- 0o0755 -- -- @since 1.4.0 toFileMode :: Word32 -> CMode toFileMode attrs = fromIntegral $ (attrs `shiftR` 16) .&. 0x0fff -- | Convert external attributes to the file info. The function assumes a -- regular file and keeps DOS attributes untouched. -- -- >>> fromFileMode 0o0755 -- 2179792896 -- -- @since 1.4.0 fromFileMode :: CMode -> Word32 fromFileMode cmode = (0o100000 .|. fromIntegral cmode) `shiftL` 16 zip-2.0.1/LICENSE.md0000644000000000000000000000265607346545000012122 0ustar0000000000000000Copyright © 2016–present Mark Karpov All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. * Neither the name Mark Karpov nor the names of contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. zip-2.0.1/README.md0000644000000000000000000002442707346545000011775 0ustar0000000000000000# Zip [![License BSD3](https://img.shields.io/badge/license-BSD3-brightgreen.svg)](http://opensource.org/licenses/BSD-3-Clause) [![Hackage](https://img.shields.io/hackage/v/zip.svg?style=flat)](https://hackage.haskell.org/package/zip) [![Stackage Nightly](http://stackage.org/package/zip/badge/nightly)](http://stackage.org/nightly/package/zip) [![Stackage LTS](http://stackage.org/package/zip/badge/lts)](http://stackage.org/lts/package/zip) ![CI](https://github.com/mrkkrp/zip/workflows/CI/badge.svg?branch=master) * [Why this library was written](#why-this-library-was-written) * [zip-archive](#zip-archive) * [LibZip](#libzip) * [zip-conduit](#zip-conduit) * [Features](#features) * [Compression methods](#compression-methods) * [Encryption](#encryption) * [Sources of file data](#sources-of-file-data) * [ZIP64](#zip64) * [Filenames](#filenames) * [Meta-information about files](#meta-information-about-files) * [Quick start](#quick-start) * [Contribution](#contribution) * [License](#license) This is a feature-rich, memory-efficient, and type-safe library to manipulate Zip archives. The library was created with large multimedia data in mind and provides all features users might expect, comparable in terms of feature-set with libraries like `libzip` in C. ## Why this library was written There are a few libraries to work with Zip archives, yet every one of them provides only a subset of useful functionality or otherwise is flawed in some way so it cannot be easily used in some situations. Let's examine all libraries available on Hackage to understand the motivation for this package. ### zip-archive `zip-archive` is a widely used library. It's quite old, well-known and simple to use. However, it creates Zip archives purely, as `ByteStrings`s in memory. This is not acceptable if you work with big data. For example, if you have a collection of files with the total size 500 MB and you want to pack them into an archive, you can easily consume up to 1 GB of memory (the files plus the resulting archive). This is not always affordable. Even if you want just to look at the list of archive entries it will read the entire archive into memory. ### LibZip This is a binding to the C library [`libzip`][libzip]. There is always a certain kind of trouble with bindings. For example, you need to ensure that the target library is installed and its version is compatible with the version of your binding. It's not that bad with libraries that do not break their API for years, but it's not the case with `libzip`. As the maintainer of `LibZip` puts it: > libzip 0.10, 0.11, and 1.0 are not binary compatible. If your C library is > 0.11.x, then you should use LibZip 0.11. If your C library is 1.0, then > you should use LibZip master branch (not yet released to Hackage). Now, on my machine I have the version 1.0. To put the package on Stackage we had to use the version 0.10, because Stackage uses Ubuntu to build packages and libraries on Ubuntu are always ancient. This means that I cannot use the version of the library from Stackage, and I don't yet know what will be on the server. After much frustration, I decided to avoid using `LibZip`. After all, this is not a project that shouldn't be done completely in Haskell. By rewriting this in Haskell, I also can make it safer to use. ### zip-conduit This one uses the right approach: leverage a good streaming library (`conduit`) for memory-efficient processing. The library is however not feature-rich and has certain problems (including the programming style, it uses `error` if an entry is missing in the archive, among other things), some of them are reported on its issue tracker. It also does not appear to be maintained (the last sign of activity was on December 23, 2014). ## Features The library supports all features specified in the modern .ZIP specification except for encryption and multi-disk archives. See more about this below. For reference, here is a [copy of the specification][specification]. ### Compression methods `zip` supports the following compression methods: * Store (no compression, just store files “as is”) * [DEFLATE] * [Bzip2] * [Zstandard] The best way to add a new compression method to the library is to write a conduit that will do the compression and publish it as a library. `zip` can then depend on it and add it to the list of supported compression methods. The current list of compression methods reflects what is available on Hackage at the moment. ### Encryption Encryption is currently not supported. Encryption system described in the .ZIP specification is known to be seriously flawed, so it's probably not the best way to protect your data anyway. The encryption method seems to be a proprietary technology of PKWARE (at least that's what stated about it in the .ZIP specification), so to hell with it. ### Sources of file data The following sources are supported: * *File name.* This is an efficient method to perform compression or decompression. You specify where to get data or where to save it and the rest is handled by the library. * *Conduit source or sink.* * *ByteString.* Use it only with small data. * *Copy file from another archive.* An efficient operation, file is copied “as is”—no re-compression is performed. ### ZIP64 When necessary, the `ZIP64` extension is automatically used. It's necessary when: * The total size of the archive is greater than 4 GB. * The size of a single compressed/uncompressed file in the archive is greater than 4 GB. * There are more than 65535 entries in the archive. The library is particularly well suited for processing large files. For example, I've been able to create 6.5 GB archive with reasonable speed and without significant memory consumption. ### Filenames The library has an API that makes it impossible to create archive with non-portable or invalid file names in it. As of .ZIP specification 6.3.2, files with Unicode symbols in their names can be stored in Zip archives. The library supports mechanisms for this and uses them automatically when needed. Besides UTF-8, CP437 is also supported as per the specification. ### Meta-information about files The library allows us to attach comments to the entire archive or individual files, and also gives its user full control over extra fields that are written to file headers, so the user can store arbitrary information about files in the archive. ## Quick start The module `Codec.Archive.Zip` provides everything you may need to manipulate Zip archives. There are three things that should be clarified right away to avoid confusion. First, we use the `EntrySelector` type that can be obtained from relative `FilePath`s (paths to directories are not allowed). This method may seem awkward at first, but it will protect you from the problems with portability when your archive is unpacked on a different platform. Second, there is no way to add directories, or to be precise, *empty directories* to your archive. This approach is used in Git and I find it sane. Finally, the third feature of the library is that it does not modify the archive instantly, because doing so on every manipulation would often be inefficient. Instead, we maintain a collection of pending actions that can be turned into an optimized procedure that efficiently modifies the archive in one pass. Normally, this should be of no concern to you, because all actions are performed automatically when you leave the `ZipArchive` monad. If, however, you ever need to force an update, the `commit` function is your friend. Let's take a look at some examples that show how to accomplish most common tasks. To get full information about archive entries, use `getEntries`: ```haskell λ> withArchive archivePath (M.keys <$> getEntries) ``` This will return a list of all entries in the archive at `archivePath`. It's possible to extract contents of an entry as a strict `ByteString`: ```haskell λ> withArchive archivePath (getEntry entrySelector) ``` …to stream them to a given sink: ```haskell λ> withArchive archivePath (sourceEntry entrySelector mySink) ``` …to save a specific entry to a file: ```haskell λ> withArchive archivePath (saveEntry entrySelector pathToFile) ``` …and finally just unpack the entire archive into a directory: ```haskell λ> withArchive archivePath (unpackInto destDir) ``` See also `getArchiveComment` and `getArchiveDescription`. Modifying is also easy. When you want to create a new archive use `createArchive`, otherwise `withArchive` will do. To add an entry from `ByteString`: ```haskell λ> createArchive archivePath (addEntry Store "Hello, World!" entrySelector) ``` You can stream from `Source` as well: ```haskell λ> createArchive archivePath (sinkEntry Deflate source entrySelector) ``` To add contents from a file, use `loadEntry`: ```haskell λ> let toSelector = const (mkEntrySelector "my-entry.txt") λ> createArchive archivePath (loadEntry BZip2 toSelector myFilePath) ``` Finally, you can copy an entry from another archive without re-compression (unless you use `recompress`, see below): ```haskell λ> createArchive archivePath (copyEntry srcArchivePath selector selector) ``` It's often desirable to just pack a directory: ```haskell λ> createArchive archivePath (packDirRecur Deflate mkEntrySelector dir) ``` It's also possible to: * rename an entry with `renameEntry` * delete an entry with `deleteEntry` * change compression method with `recompress` * change comment associated with an entry with `setEntryComment` * delete comment with `deleteEntryComment` * set modification time with `setModTime` * manipulate extra fields with `addExtraField` and `deleteExtraField` * check if entry is intact with `checkEntry` * undo changes with `undoEntryCanges`, `undoArchiveChanges`, and `undoAll` * force changes to be written to file system with `commit` This should cover all your needs. Feel free to open an issue if you're missing something. ## Contribution You can contact the maintainer via [the issue tracker](https://github.com/mrkkrp/zip/issues). Pull requests are welcome. ## License Copyright © 2016–present Mark Karpov Distributed under BSD 3 clause license. [libzip]: https://en.wikipedia.org/wiki/Libzip [specification]: https://pkware.cachefly.net/webdocs/APPNOTE/APPNOTE-6.3.3.TXT [DEFLATE]: https://en.wikipedia.org/wiki/DEFLATE [Bzip2]: https://en.wikipedia.org/wiki/Bzip2 [Zstandard]: https://en.wikipedia.org/wiki/Zstandard zip-2.0.1/bench-app/0000755000000000000000000000000007346545000012342 5ustar0000000000000000zip-2.0.1/bench-app/Main.hs0000644000000000000000000000075607346545000013572 0ustar0000000000000000module Main (main) where import Codec.Archive.Zip import System.Environment (getArgs) import System.Exit (exitFailure) import System.FilePath main :: IO () main = do [operation, input, output] <- getArgs case operation of "compress" -> do selector <- mkEntrySelector (takeFileName input) createArchive output (loadEntry Deflate selector input) "uncompress" -> withArchive input (unpackInto output) _ -> do putStrLn "Unknown command." exitFailure zip-2.0.1/tests/0000755000000000000000000000000007346545000011647 5ustar0000000000000000zip-2.0.1/tests/Main.hs0000644000000000000000000007047507346545000013104 0ustar0000000000000000{-# LANGUAGE CPP #-} {-# LANGUAGE FlexibleInstances #-} {-# LANGUAGE LambdaCase #-} {-# LANGUAGE OverloadedStrings #-} {-# OPTIONS_GHC -fno-warn-orphans #-} module Main (main) where import Codec.Archive.Zip import Codec.Archive.Zip.CP437 import Codec.Archive.Zip.Unix import Control.Monad import Control.Monad.IO.Class import Data.Bits import Data.ByteString (ByteString) import Data.ByteString qualified as B import Data.ByteString.Builder qualified as LB import Data.ByteString.Lazy qualified as LB import Data.Conduit qualified as C import Data.Conduit.List qualified as CL import Data.DList qualified as DList import Data.List (intercalate) import Data.Map (Map, (!)) import Data.Map.Strict qualified as M import Data.Maybe (fromJust) import Data.Set qualified as E import Data.Text (Text) import Data.Text qualified as T import Data.Text.Encoding qualified as T import Data.Time import Data.Version import Data.Word import System.Directory import System.FilePath (()) import System.FilePath qualified as FP import System.IO import System.IO.Error (isDoesNotExistError) import System.IO.Temp import Test.Hspec import Test.QuickCheck hiding ((.&.)) -- | Zip tests. Please note that the Zip64 feature is not currently tested -- automatically because we'd need > 4GB of data. Handling such quantities -- of data locally is problematic and even more problematic on CI. main :: IO () main = hspec $ do describe "mkEntrySelector" mkEntrySelectorSpec describe "unEntrySelector" unEntrySelectorSpec describe "getEntryName" getEntryNameSpec describe "decodeCP437" decodeCP437Spec describe "fromFileMode" fromFileModeSpec around withSandbox $ do describe "createArchive" createArchiveSpec describe "withArchive" withArchiveSpec describe "archive comment" archiveCommentSpec describe "getEntryDesc" getEntryDescSpec describe "version needed" versionNeededSpec describe "addEntry" addEntrySpec describe "sinkEntry" sinkEntrySpec describe "loadEntry" loadEntrySpec describe "copyEntry" copyEntrySpec describe "checkEntry" checkEntrySpec describe "recompress" recompressSpec describe "entry comment" entryCommentSpec describe "setModTime" setModTimeSpec describe "extra field" extraFieldSpec describe "setExternalFileAttrsSpec" setExternalFileAttrsSpec describe "renameEntry" renameEntrySpec describe "deleteEntry" deleteEntrySpec describe "forEntries" forEntriesSpec describe "undoEntryChanges" undoEntryChangesSpec describe "undoArchiveChanges" undoArchiveChangesSpec describe "undoAll" undoAllSpec describe "consistency" consistencySpec describe "packDirRecur'" packDirRecur'Spec describe "unpackInto" unpackIntoSpec ---------------------------------------------------------------------------- -- Arbitrary instances and generators instance Arbitrary Text where arbitrary = T.pack <$> listOf1 arbitrary instance Arbitrary ByteString where arbitrary = B.pack <$> listOf arbitrary {- ORMOLU_DISABLE -} instance Arbitrary CompressionMethod where arbitrary = elements [ Store, #ifdef ENABLE_BZIP2 BZip2, #endif #ifdef ENABLE_ZSTD Zstd, #endif Deflate ] {- ORMOLU_ENABLE -} instance Arbitrary UTCTime where arbitrary = UTCTime <$> (ModifiedJulianDay <$> choose (44239, 90989)) <*> (secondsToDiffTime <$> choose (0, 86399)) newtype RelPath = RelPath FilePath instance Show RelPath where show (RelPath path) = show path instance Arbitrary RelPath where arbitrary = do p <- resize 10 $ intercalate "/" <$> listOf1 ( (++) <$> vectorOf 3 charGen <*> listOf1 charGen ) case mkEntrySelector p of Nothing -> arbitrary Just _ -> return (RelPath p) instance Arbitrary EntrySelector where arbitrary = do RelPath x <- arbitrary case mkEntrySelector x of Nothing -> arbitrary Just s -> return s data EM = EM EntrySelector EntryDescription (ZipArchive ()) deriving (Show) instance Arbitrary EM where arbitrary = do s <- arbitrary method <- arbitrary content <- arbitrary modTime <- arbitrary comment <- arbitrary externalFileAttrs <- arbitrary extraFieldTag <- arbitrary `suchThat` (/= 1) extraFieldContent <- arbitrary `suchThat` ((< 0xffff) . B.length) let action = do addEntry method content s setModTime modTime s setEntryComment comment s addExtraField extraFieldTag extraFieldContent s setExternalFileAttrs externalFileAttrs s return $ EM s EntryDescription { edVersionMadeBy = undefined, edVersionNeeded = undefined, edCompression = method, edModTime = modTime, edCRC32 = undefined, edCompressedSize = undefined, edUncompressedSize = fromIntegral (B.length content), edOffset = undefined, edComment = Just comment, edExtraField = M.singleton extraFieldTag extraFieldContent, edExternalFileAttrs = externalFileAttrs } action data EC = EC (Map EntrySelector EntryDescription) (ZipArchive ()) deriving (Show) instance Arbitrary EC where arbitrary = do let f (EM s d z) = (s, (d, z)) m <- M.fromList . fmap f <$> downScale (listOf arbitrary) return (EC (M.map fst m) (sequence_ $ snd <$> M.elems m)) charGen :: Gen Char charGen = frequency [ (3, choose ('a', 'z')), (3, choose ('A', 'Z')), (3, choose ('0', '9')), (1, arbitrary `suchThat` (>= ' ')) ] binASCII :: Gen ByteString binASCII = LB.toStrict . LB.toLazyByteString <$> go where go = frequency [ (10, (<>) <$> (LB.word8 <$> choose (0, 127)) <*> go), (1, return mempty) ] instance Show EntryDescription where show ed = "{ edCompression = " ++ show (edCompression ed) ++ "\n, edModTime = " ++ show (edModTime ed) ++ "\n, edUncompressedSize = " ++ show (edUncompressedSize ed) ++ "\n, edComment = " ++ show (edComment ed) ++ "\n, edExtraField = " ++ show (edExtraField ed) ++ "\n, edExtFileAttr = " ++ show (edExternalFileAttrs ed) ++ " }" instance Show (ZipArchive a) where show = const "" ---------------------------------------------------------------------------- -- Pure operations and periphery mkEntrySelectorSpec :: Spec mkEntrySelectorSpec = do let rejects x = mkEntrySelector x `shouldThrow` isEntrySelectorException x accepts x = do s <- mkEntrySelector x getEntryName s `shouldBe` T.pack x context "when absolute paths are passed" $ it "they are rejected" $ property $ \(RelPath x) -> rejects ('/' : x) context "when paths with trailing path separator are passed" $ it "they are rejected" $ do rejects "foo/" rejects "foo/bar/" context "when paths with dot as path segment are passed" $ it "they are rejected" $ do rejects "./foo/bar" rejects "foo/./bar" rejects "foo/bar/." context "when paths with double dot as path segment are passed" $ it "they are rejected" $ do rejects "../foo/bar" rejects "foo/../bar" rejects "foo/bar/.." context "when too long paths are passed" $ it "rejects them" $ do let path = replicate 0x10000 'a' mkEntrySelector path `shouldThrow` isEntrySelectorException path context "when correct paths are passed" $ it "adequately represents them" $ do accepts "foo" accepts "one/two/three" accepts "something.txt" unEntrySelectorSpec :: Spec unEntrySelectorSpec = context "when entry selector exists" $ it "has corresponding path" . property $ \s -> not . null . unEntrySelector $ s getEntryNameSpec :: Spec getEntryNameSpec = context "when entry selector exists" $ it "has corresponding representation" . property $ \s -> not . T.null . getEntryName $ s decodeCP437Spec :: Spec decodeCP437Spec = do context "when ASCII-compatible subset is used" $ it "has the same result as decoding UTF-8" . property $ forAll binASCII $ \bin -> decodeCP437 bin `shouldBe` T.decodeUtf8 bin context "when non-ASCII subset is used" $ it "is decoded correctly" $ do let c b t = decodeCP437 (B.pack b) `shouldBe` t c [0x80 .. 0x9f] "ÇüéâäàåçêëèïîìÄÅÉæÆôöòûùÿÖÜ¢£¥₧ƒ" c [0xa0 .. 0xbf] "áíóúñѪº¿⌐¬½¼¡«»░▒▓│┤╡╢╖╕╣║╗╝╜╛┐" c [0xc0 .. 0xdf] "└┴┬├─┼╞╟╚╔╩╦╠═╬╧╨╤╥╙╘╒╓╫╪┘┌█▄▌▐▀" c [0xe0 .. 0xff] "αßΓπΣσµτΦΘΩδ∞φε∩≡±≥≤⌠⌡÷≈°∙·√ⁿ²■ " fromFileModeSpec :: Spec fromFileModeSpec = context "UNIX helpers" $ do it "toFileMode . fromFileMode == id .&. 0x0fffff" . property $ \mode -> (toFileMode . fromFileMode) (fromIntegral mode) == fromIntegral (mode .&. (0x0fff :: Word16)) it "toFileMode == toFileMode . fromFileMode . toFileMode" . property $ \mode -> toFileMode mode == (toFileMode . fromFileMode . toFileMode) mode ---------------------------------------------------------------------------- -- Primitive editing/querying actions createArchiveSpec :: SpecWith FilePath createArchiveSpec = do context "when called with non-existent path and empty recipe" $ it "creates correct representation of empty archive" $ \path -> do createArchive path (return ()) B.readFile path `shouldReturn` emptyArchive context "when called with an occupied path" $ it "overwrites it" $ \path -> do B.writeFile path B.empty createArchive path (return ()) B.readFile path `shouldNotReturn` B.empty withArchiveSpec :: SpecWith FilePath withArchiveSpec = do context "when called with non-existent path" $ it "throws 'isDoesNotExistError' exception" $ \path -> withArchive path (return ()) `shouldThrow` isDoesNotExistError context "when called with occupied path (empty file)" $ it "throws 'ParsingFailed' exception" $ \path -> do B.writeFile path B.empty withArchive path (return ()) `shouldThrow` isParsingFailed path "Cannot locate end of central directory" context "when called with occupied path (empty archive)" $ it "does not overwrite the file unnecessarily" $ \path -> do B.writeFile path emptyArchive withArchive path $ liftIO $ B.writeFile path B.empty B.readFile path `shouldNotReturn` emptyArchive archiveCommentSpec :: SpecWith FilePath archiveCommentSpec = do context "when new archive is created" $ it "returns no archive comment" $ \path -> createArchive path getArchiveComment `shouldReturn` Nothing context "when comment contains end of central directory signature" $ it "reads it without problems" $ \path -> do entries <- createArchive path $ do setArchiveComment "I saw you want to have PK\ENQ\ACK here." commit getEntries entries `shouldBe` M.empty context "when comment is committed (delete/set)" $ it "reads it and updates" $ \path -> property $ \txt -> do comment <- createArchive path $ do deleteArchiveComment setArchiveComment txt commit getArchiveComment comment `shouldBe` Just txt context "when comment is committed (set/delete)" $ it "reads it and updates" $ \path -> property $ \txt -> do comment <- createArchive path $ do setArchiveComment txt deleteArchiveComment commit getArchiveComment comment `shouldBe` Nothing context "when pre-existing comment is overwritten" $ it "returns the new comment" $ \path -> property $ \txt txt' -> do comment <- createArchive path $ do setArchiveComment txt commit setArchiveComment txt' commit getArchiveComment comment `shouldBe` Just txt' context "when pre-existing comment is deleted" $ it "actually deletes it" $ \path -> property $ \txt -> do comment <- createArchive path $ do setArchiveComment txt commit deleteArchiveComment commit getArchiveComment comment `shouldBe` Nothing getEntryDescSpec :: SpecWith FilePath getEntryDescSpec = it "always returns correct description" $ \path -> property $ \(EM s desc z) -> do desc' <- fromJust <$> createArchive path (z >> commit >> getEntryDesc s) desc' `shouldSatisfy` softEq desc versionNeededSpec :: SpecWith FilePath versionNeededSpec = it "writes correct version that is needed to extract archive" $ \path -> -- NOTE for now we check only how version depends on compression method, -- it should be mentioned that the version also depends on Zip64 feature property $ \(EM s desc z) -> do desc' <- fromJust <$> createArchive path (z >> commit >> getEntryDesc s) edVersionNeeded desc' `shouldBe` makeVersion ( case edCompression desc of Store -> [2, 0] Deflate -> [2, 0] BZip2 -> [4, 6] Zstd -> [6, 3] ) addEntrySpec :: SpecWith FilePath addEntrySpec = context "when an entry is added" $ it "is there" $ \path -> property $ \m b s -> do info <- createArchive path $ do addEntry m b s commit (,) <$> getEntry s <*> (edCompression . (! s) <$> getEntries) info `shouldBe` (b, m) sinkEntrySpec :: SpecWith FilePath sinkEntrySpec = context "when an entry is sunk" $ it "is there" $ \path -> property $ \m b s -> do info <- createArchive path $ do sinkEntry m (C.yield b) s commit (,) <$> sourceEntry s (CL.foldMap id) <*> (edCompression . (! s) <$> getEntries) info `shouldBe` (b, m) loadEntrySpec :: SpecWith FilePath loadEntrySpec = context "when an entry is loaded" $ it "is there" $ \path -> property $ \m b s t -> do let vpath = deriveVacant path B.writeFile vpath b setModificationTime vpath t createArchive path $ do loadEntry m s vpath commit liftIO (removeFile vpath) saveEntry s vpath B.readFile vpath `shouldReturn` b modTime <- getModificationTime vpath modTime `shouldSatisfy` isCloseTo t copyEntrySpec :: SpecWith FilePath copyEntrySpec = context "when entry is copied form another archive" $ it "is there" $ \path -> property $ \m b s -> do let vpath = deriveVacant path createArchive vpath (addEntry m b s) info <- createArchive path $ do copyEntry vpath s s commit (,) <$> getEntry s <*> (edCompression . (! s) <$> getEntries) info `shouldBe` (b, m) checkEntrySpec :: SpecWith FilePath checkEntrySpec = do context "when entry is intact" $ it "passes the check" $ \path -> property $ \m b s -> do check <- createArchive path $ do addEntry m b s commit checkEntry s check `shouldBe` True context "when entry is corrupted" $ it "does not pass the check" $ \path -> property $ \b s -> not (B.null b) ==> do let headerLength = 50 + (B.length . T.encodeUtf8 . getEntryName $ s) localFileHeaderOffset <- createArchive path $ do addEntry Store b s commit fromIntegral . edOffset . (! s) <$> getEntries withFile path ReadWriteMode $ \h -> do hSeek h AbsoluteSeek (localFileHeaderOffset + fromIntegral headerLength) byte <- B.map complement <$> B.hGet h 1 hSeek h RelativeSeek (-1) B.hPut h byte withArchive path (checkEntry s) `shouldReturn` False recompressSpec :: SpecWith FilePath recompressSpec = context "when recompression is used" $ it "gets recompressed" $ \path -> property $ \m m' b s -> do info <- createArchive path $ do addEntry m b s commit recompress m' s commit (,) <$> getEntry s <*> (edCompression . (! s) <$> getEntries) info `shouldBe` (b, m') entryCommentSpec :: SpecWith FilePath entryCommentSpec = do context "when comment is committed (delete/set)" $ it "reads it and updates" $ \path -> property $ \txt s -> do comment <- createArchive path $ do addEntry Store "foo" s deleteEntryComment s setEntryComment txt s commit edComment . (! s) <$> getEntries comment `shouldBe` Just txt context "when comment is committed (set/delete)" $ it "reads it and updates" $ \path -> property $ \txt s -> do comment <- createArchive path $ do addEntry Store "foo" s setEntryComment txt s deleteEntryComment s commit edComment . (! s) <$> getEntries comment `shouldBe` Nothing context "when pre-existing comment is overwritten" $ it "returns the new comment" $ \path -> property $ \txt txt' s -> do comment <- createArchive path $ do addEntry Store "foo" s setEntryComment txt s commit setEntryComment txt' s commit edComment . (! s) <$> getEntries comment `shouldBe` Just txt' context "when pre-existing comment is deleted" $ it "actually deletes it" $ \path -> property $ \txt s -> do comment <- createArchive path $ do addEntry Store "foo" s setEntryComment txt s commit deleteEntryComment s commit edComment . (! s) <$> getEntries comment `shouldBe` Nothing setModTimeSpec :: SpecWith FilePath setModTimeSpec = do context "when mod time is set (after creation)" $ it "reads it and updates" $ \path -> property $ \time s -> do modTime <- createArchive path $ do addEntry Store "foo" s setModTime time s commit edModTime . (! s) <$> getEntries modTime `shouldSatisfy` isCloseTo time context "when mod time is set (before creation)" $ it "has no effect" $ \path -> property $ \time time' s -> not (isCloseTo time time') ==> do modTime <- createArchive path $ do setModTime time s addEntry Store "foo" s commit edModTime . (! s) <$> getEntries modTime `shouldNotSatisfy` isCloseTo time extraFieldSpec :: SpecWith FilePath extraFieldSpec = do context "when extra field is committed (delete/set)" $ it "reads it and updates" $ \path -> property $ \n b s -> n /= 1 ==> do efield <- createArchive path $ do addEntry Store "foo" s deleteExtraField n s addExtraField n b s commit M.lookup n . edExtraField . (! s) <$> getEntries efield `shouldBe` Just b context "when extra field is committed (set/delete)" $ it "reads it and updates" $ \path -> property $ \n b s -> n /= 1 ==> do efield <- createArchive path $ do addEntry Store "foo" s addExtraField n b s deleteExtraField n s commit M.lookup n . edExtraField . (! s) <$> getEntries efield `shouldBe` Nothing context "when pre-existing extra field is overwritten" $ it "reads it and updates" $ \path -> property $ \n b b' s -> n /= 1 ==> do efield <- createArchive path $ do addEntry Store "foo" s addExtraField n b s commit addExtraField n b' s commit M.lookup n . edExtraField . (! s) <$> getEntries efield `shouldBe` Just b' context "when pre-existing extra field is deleted" $ it "actually deletes it" $ \path -> property $ \n b s -> n /= 1 ==> do efield <- createArchive path $ do addEntry Store "foo" s addExtraField n b s commit deleteExtraField n s commit M.lookup n . edExtraField . (! s) <$> getEntries efield `shouldBe` Nothing setExternalFileAttrsSpec :: SpecWith FilePath setExternalFileAttrsSpec = context "when an external file attribute is added (after creation)" $ it "sets a custom external file attribute" $ \path -> property $ \attr s -> do attr' <- createArchive path $ do addEntry Store "foo" s setExternalFileAttrs attr s commit edExternalFileAttrs . (! s) <$> getEntries attr' `shouldBe` attr renameEntrySpec :: SpecWith FilePath renameEntrySpec = do context "when renaming after editing of new entry" $ it "produces correct result" $ \path -> property $ \(EM s desc z) s' -> do desc' <- createArchive path $ do z renameEntry s s' commit (! s') <$> getEntries desc' `shouldSatisfy` softEq desc context "when renaming existing entry" $ it "gets renamed" $ \path -> property $ \(EM s desc z) s' -> do desc' <- createArchive path $ do z commit renameEntry s s' commit (! s') <$> getEntries desc' `shouldSatisfy` softEq desc deleteEntrySpec :: SpecWith FilePath deleteEntrySpec = do context "when deleting after editing of new entry" $ it "produces correct result" $ \path -> property $ \(EM s _ z) -> do member <- createArchive path $ do z deleteEntry s commit doesEntryExist s member `shouldBe` False context "when deleting existing entry" $ it "gets deleted" $ \path -> property $ \(EM s _ z) -> do member <- createArchive path $ do z commit deleteEntry s commit doesEntryExist s member `shouldBe` False forEntriesSpec :: SpecWith FilePath forEntriesSpec = it "affects all existing entries" $ \path -> property $ \(EC m z) txt -> do m' <- createArchive path $ do z commit forEntries (setEntryComment txt) commit getEntries let f ed = ed {edComment = Just txt} m' `shouldSatisfy` softEqMap (M.map f m) undoEntryChangesSpec :: SpecWith FilePath undoEntryChangesSpec = it "cancels all actions for specified entry" $ \path -> property $ \(EM s _ z) -> do member <- createArchive path $ do z undoEntryChanges s commit doesEntryExist s member `shouldBe` False undoArchiveChangesSpec :: SpecWith FilePath undoArchiveChangesSpec = do it "cancels archive comment editing" $ \path -> property $ \txt -> do comment <- createArchive path $ do setArchiveComment txt undoArchiveChanges commit getArchiveComment comment `shouldBe` Nothing it "cancels archive comment deletion" $ \path -> property $ \txt -> do comment <- createArchive path $ do setArchiveComment txt commit deleteArchiveComment undoArchiveChanges commit getArchiveComment comment `shouldBe` Just txt undoAllSpec :: SpecWith FilePath undoAllSpec = it "cancels all editing at once" $ \path -> property $ \(EC _ z) txt -> do createArchive path (return ()) withArchive path $ do z setArchiveComment txt undoAll liftIO (B.writeFile path B.empty) B.readFile path `shouldReturn` B.empty ---------------------------------------------------------------------------- -- Complex construction/restoration consistencySpec :: SpecWith FilePath consistencySpec = it "can save and restore arbitrary archive" $ \path -> property $ \(EC m z) txt -> do (txt', m') <- createArchive path $ do z setArchiveComment txt commit (,) <$> getArchiveComment <*> getEntries txt' `shouldBe` Just txt m' `shouldSatisfy` softEqMap m packDirRecur'Spec :: SpecWith FilePath packDirRecur'Spec = it "packs arbitrary directory recursively" $ \path -> property $ forAll (downScale arbitrary) $ \contents -> withSystemTempDirectory "zip-sandbox" $ \dir -> do forM_ contents $ \s -> do let item = dir unEntrySelector s createDirectoryIfMissing True (FP.takeDirectory item) B.writeFile item "foo" let magicFileAttrs = 123456789 entries <- createArchive path $ do packDirRecur' Store mkEntrySelector (setExternalFileAttrs magicFileAttrs) dir commit getEntries M.keysSet entries `shouldBe` E.fromList contents forM_ (M.elems entries) $ \desc -> edExternalFileAttrs desc `shouldBe` magicFileAttrs unpackIntoSpec :: SpecWith FilePath unpackIntoSpec = it "unpacks archive contents into directory" $ \path -> property $ \(EC m z) -> withSystemTempDirectory "zip-sandbox" $ \dir -> do createArchive path $ do z commit unpackInto dir selectors <- listDirRecur dir >>= mapM mkEntrySelector let x = E.fromList selectors y = M.keysSet m E.difference x y `shouldBe` E.empty ---------------------------------------------------------------------------- -- Helpers -- | Change the size parameter of a generator by dividing it by 2. downScale :: Gen a -> Gen a downScale = scale (`div` 2) -- | Check whether a given exception is 'EntrySelectorException' with a -- specific path inside. isEntrySelectorException :: FilePath -> EntrySelectorException -> Bool isEntrySelectorException path (InvalidEntrySelector p) = p == path -- | Check whether a given exception is 'ParsingFailed' exception with a -- specific path and error message inside. isParsingFailed :: FilePath -> String -> ZipException -> Bool isParsingFailed path msg (ParsingFailed path' msg') = path == path' && msg == msg' isParsingFailed _ _ _ = False -- | Create a sandbox directory to model some situation in it and run some -- tests. Note that we're using a new unique sandbox directory for each test -- case to avoid contamination and it's unconditionally deleted after the -- test case finishes. The function returns a vacant file path in that -- directory. withSandbox :: ActionWith FilePath -> IO () withSandbox action = withSystemTempDirectory "zip-sandbox" $ \dir -> action (dir "foo.zip") -- | Given a primary name (name of archive), generate a name that does not -- collide with it. deriveVacant :: FilePath -> FilePath deriveVacant = ( "bar") . FP.takeDirectory -- | Compare times forgiving a minor difference. isCloseTo :: UTCTime -> UTCTime -> Bool isCloseTo a b = abs (diffUTCTime a b) < 2 -- | Compare for equality taking into account only some fields of the -- 'EntryDescription' record. softEq :: EntryDescription -> EntryDescription -> Bool softEq a b = edCompression a == edCompression b && isCloseTo (edModTime a) (edModTime b) && edUncompressedSize a == edUncompressedSize b && edComment a == edComment b && M.delete 1 (edExtraField a) == M.delete 1 (edExtraField b) -- | Compare two maps describing archive entries in such a way that only -- some fields in 'EntryDescription' record are tested. softEqMap :: Map EntrySelector EntryDescription -> Map EntrySelector EntryDescription -> Bool softEqMap n m = M.null (M.differenceWith f n m) where f a b = if softEq a b then Nothing else Just a -- | The canonical representation of an empty Zip archive. emptyArchive :: ByteString emptyArchive = B.pack [ 0x50, 0x4b, 0x05, 0x06, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 ] -- | Recursively list a directory. Do not return paths to empty directories. listDirRecur :: FilePath -> IO [FilePath] listDirRecur path = DList.toList <$> go "" where go adir = do let cdir = path adir raw <- listDirectory cdir fmap mconcat . forM raw $ \case "" -> return mempty "." -> return mempty ".." -> return mempty x -> do let fullx = cdir x adir' = adir x isFile <- doesFileExist fullx isDir <- doesDirectoryExist fullx if isFile then return (DList.singleton adir') else if isDir then go adir' else return mempty zip-2.0.1/zip.cabal0000644000000000000000000000736407346545000012305 0ustar0000000000000000cabal-version: 2.4 name: zip version: 2.0.1 license: BSD-3-Clause license-file: LICENSE.md maintainer: Mark Karpov author: Mark Karpov tested-with: ghc ==9.4.7 ghc ==9.6.3 ghc ==9.8.1 homepage: https://github.com/mrkkrp/zip bug-reports: https://github.com/mrkkrp/zip/issues synopsis: Operations on zip archives description: Operations on zip archives. category: Codec build-type: Simple extra-doc-files: CHANGELOG.md README.md source-repository head type: git location: https://github.com/mrkkrp/zip.git flag dev description: Turn on development settings. default: False manual: True flag disable-bzip2 description: Removes dependency on bzip2 C library and hence support for BZip2 entries. default: False manual: True flag disable-zstd description: Removes dependency on zstd C library and hence support for Zstandard entries. default: False manual: True library exposed-modules: Codec.Archive.Zip Codec.Archive.Zip.CP437 Codec.Archive.Zip.Unix other-modules: Codec.Archive.Zip.Internal Codec.Archive.Zip.Type default-language: GHC2021 build-depends: base >=4.15 && <5, bytestring >=0.9 && <0.13, case-insensitive >=1.2.0.2 && <1.3, cereal >=0.3 && <0.6, conduit >=1.3 && <1.4, conduit-extra >=1.3 && <1.4, containers >=0.5 && <0.7, digest <0.1, directory >=1.2.2 && <1.4, dlist >=0.8 && <2.0, exceptions >=0.6 && <0.11, filepath >=1.2 && <1.5, monad-control >=1.0 && <1.1, mtl >=2 && <3, resourcet >=1.2 && <1.4, text >=0.2 && <2.2, time >=1.4 && <1.13, transformers >=0.4 && <0.7, transformers-base if !flag(disable-bzip2) build-depends: bzlib-conduit >=0.3 && <0.4 if !flag(disable-zstd) build-depends: conduit-zstd >=0.0.2 && <0.1 if flag(dev) cpp-options: -DHASKELL_ZIP_DEV_MODE ghc-options: -Wall -Werror -Wredundant-constraints -Wpartial-fields -Wunused-packages else ghc-options: -O2 -Wall if !flag(disable-bzip2) cpp-options: -DENABLE_BZIP2 if !flag(disable-zstd) cpp-options: -DENABLE_ZSTD if os(windows) cpp-options: -DZIP_OS=0 else cpp-options: -DZIP_OS=3 build-depends: unix <2.9 executable haskell-zip-app main-is: Main.hs hs-source-dirs: bench-app default-language: GHC2021 build-depends: base >=4.15 && <5, filepath >=1.2 && <1.5, zip if flag(dev) ghc-options: -Wall -Werror -Wredundant-constraints -Wpartial-fields -Wunused-packages else ghc-options: -O2 -Wall test-suite tests type: exitcode-stdio-1.0 main-is: Main.hs hs-source-dirs: tests default-language: GHC2021 build-depends: base >=4.15 && <5, QuickCheck >=2.4 && <3, bytestring >=0.9 && <0.13, conduit >=1.3 && <1.4, containers >=0.5 && <0.7, directory >=1.2.2 && <1.4, dlist >=0.8 && <2, filepath >=1.2 && <1.5, hspec >=2 && <3, temporary >=1.1 && <1.4, text >=0.2 && <2.2, time >=1.4 && <1.13, zip if flag(dev) ghc-options: -Wall -Werror -Wredundant-constraints -Wpartial-fields -Wunused-packages else ghc-options: -O2 -Wall if !flag(disable-bzip2) cpp-options: -DENABLE_BZIP2 if !flag(disable-zstd) cpp-options: -DENABLE_ZSTD