How often do you work with ZIP files in your day-to-day life?

If you have ever worked with ZIP files, you know that many files and directories can be compressed together into a single file with a .zip extension.

So, to read those files, we normally need to extract them from the ZIP format first.

In this tutorial, we are going to use some Pythonic ways to perform various operations on ZIP files without even extracting them.

For that purpose, we are going to use Python's zipfile module, which handles the process for us nice and easy.

What is a ZIP file?

As mentioned above, a ZIP file contains one or more files or directories that have been compressed.

ZIP is an archive file format that supports lossless data compression.

Lossless compression means that the original data can be perfectly reconstructed from the compressed data without losing any information.
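To make the idea of lossless compression concrete, here is a minimal sketch using the standard library's zlib module, which implements the Deflate algorithm commonly used inside ZIP archives. The sample data is made up purely for illustration.

```python
import zlib

# Hypothetical sample data: repetitive text compresses well.
original = b"ZIP files are handy. " * 100

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), len(compressed))  # the compressed form is smaller
print(restored == original)            # lossless: every byte comes back
```

The round trip returns exactly the bytes we started with, which is what "lossless" means in practice.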

If you are wondering what an archive file is, it is simply a single computer file composed of one or more files along with their metadata.

This format was originally created in 1989 and was first implemented in PKWARE, Inc.'s PKZIP utility, as a replacement for the previous ARC compression format by Thom Henderson. The ZIP format was then quickly supported by many software utilities other than PKZIP.


Illustration showing how the files are placed on the disk.

What is the need for a ZIP file?

ZIP files can be crucial for people who work with computers and deal with large amounts of digital data, because they allow them to:

  • Reduce storage requirements by compressing files without losing any information.

  • Improve transfer speed over the network.

  • Collect all related files into one archive for better management.

  • Provide security by encrypting the files.

How to manipulate ZIP files using Python?

Python's standard library provides several tools for working with compressed data and archives. These include lower-level modules such as zlib, bz2, and lzma, which compress and decompress data using specific compression algorithms, as well as tarfile for tar archives.

Apart from these, Python has a high-level module called zipfile that helps us read, write, create, extract, and list the contents of ZIP files.

Python's zipfile

The zipfile module provides convenient classes and functions for reading, writing, and extracting ZIP files.

But it has limitations too:

  • Decryption is slow because it is implemented in pure Python.

  • It can't create encrypted ZIP files.

  • Multi-disk ZIP files aren't currently supported.

Opening ZIP files for Reading & Writing

zipfile has a class, ZipFile, that allows us to open ZIP files in different modes and works much like Python's open() function.

There are four modes available:

  • r: Opens the file in reading mode. This is the default.

  • w: Writing mode.

  • a: Append to an existing file.

  • x: Exclusively create and write a new file. It fails if the file already exists.

ZipFile is also a context manager and therefore supports the with statement.

import zipfile

with zipfile.ZipFile("", mode="r") as arch:
    arch.printdir()

File Name                                             Modified             Size
document.txt                                   2022-07-04 18:13:36           52
data.txt                                       2022-07-04 18:17:30        37538
                                               2022-07-04 18:33:02         7064

Here, we can see that all the files present in the archive have been listed.

The first argument we passed to ZipFile is the path of the file, given as a string.

The second argument is the mode. Reading mode is the default, so it doesn't matter whether you pass it or not.

Then we called .printdir() on arch, which holds the ZipFile instance, to print the table of contents in a user-friendly format with three columns:

  • File Name
  • Modified
  • Size
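The four modes listed earlier can be tried out safely in a temporary directory. The sketch below is only an illustration: the archive name demo.zip and the member note.txt are hypothetical, and writestr() is used so that no input files are needed on disk.

```python
import os
import tempfile
import zipfile

# Work in a throwaway directory so nothing on disk is overwritten.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.zip")  # hypothetical archive name

    # "x" creates the archive and fails if it already exists.
    with zipfile.ZipFile(path, mode="x") as arch:
        arch.writestr("note.txt", "created with mode x")

    try:
        with zipfile.ZipFile(path, mode="x") as arch:
            pass
    except FileExistsError:
        exclusive_failed = True
    else:
        exclusive_failed = False

    # "r" (the default) opens the archive for reading only.
    with zipfile.ZipFile(path) as arch:
        names = arch.namelist()

print(exclusive_failed, names)
```

The second attempt with mode x fails because the archive already exists, which makes x a safe choice when you never want to overwrite an archive by accident.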

Error Handling by using Try & Except

Now let's see how zipfile reports problems through the BadZipFile exception class, which provides an easily readable error message.

# Provided valid zip file
try:
    with zipfile.ZipFile("") as arch:
        arch.printdir()
except zipfile.BadZipFile as error:
    print(error)

File Name                                             Modified             Size
document.txt                                   2022-07-04 18:13:36           52
data.txt                                       2022-07-04 18:17:30        37538
                                               2022-07-04 18:33:02         7064

# Provided bad zip file
try:
    with zipfile.ZipFile("") as arch:
        arch.printdir()
except zipfile.BadZipFile as error:
    print(error)

File is not a zip file

The first code block ran successfully and printed the contents because we provided a valid ZIP file, whereas a BadZipFile error was raised when we provided a bad ZIP file.

We can check whether a ZIP file is valid by using the is_zipfile() function.

# Example 1
valid = zipfile.is_zipfile("")
print(valid)

# Example 2
valid = zipfile.is_zipfile("")
print(valid)

is_zipfile() returns True if the file is a valid ZIP file; otherwise it returns False.

# Print content if a file is valid
if zipfile.is_zipfile(""):
    with zipfile.ZipFile("") as arch:
        arch.printdir()
else:
    print("This is not a valid ZIP format.")

File Name                                             Modified             Size
document.txt                                   2022-07-04 18:13:36           52
data.txt                                       2022-07-04 18:17:30        37538
                                               2022-07-04 18:33:02         7064

if zipfile.is_zipfile(""):
    with zipfile.ZipFile("") as arch:
        arch.printdir()
else:
    print("This is not a valid ZIP format file.")

This is not a valid ZIP format file.
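Both checks can be seen side by side in a self-contained sketch. It builds one valid archive and one plain file in a temporary directory; the names good.zip, bad.zip, and hello.txt are hypothetical and chosen only for this illustration.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.zip")  # hypothetical names
    bad = os.path.join(tmp, "bad.zip")

    # A real archive with one member.
    with zipfile.ZipFile(good, "w") as arch:
        arch.writestr("hello.txt", "hello")
    # A plain file that merely has a .zip extension.
    with open(bad, "wb") as fh:
        fh.write(b"this is plain text, not a ZIP archive")

    results = {}
    for label, path in (("good", good), ("bad", bad)):
        try:
            with zipfile.ZipFile(path) as arch:
                results[label] = arch.namelist()
        except zipfile.BadZipFile as error:
            results[label] = f"error: {error}"

    checks = (zipfile.is_zipfile(good), zipfile.is_zipfile(bad))

print(results)
print(checks)
```

Catching BadZipFile suits code that wants to open the archive anyway, while is_zipfile() is handy when you only need a yes-or-no answer first.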

Writing the ZIP file

To open a ZIP file for writing, use write mode w.

If the archive you are trying to write already exists, w truncates it and writes only the new content that you pass in.

import zipfile

# Adding a single file
with zipfile.ZipFile('', 'w') as myzip:
    myzip.write('geek.txt')
    myzip.printdir()


File Name                                             Modified             Size
geek.txt                                       2022-07-05 14:52:01           85

geek.txt will be added to the archive, which is created when the code runs.
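The truncating behaviour of w mode is easy to overlook, so here is a small sketch that demonstrates it. The archive and member names are hypothetical, and writestr() is used instead of write() so the example does not depend on files existing on disk.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.zip")  # hypothetical archive name

    with zipfile.ZipFile(path, "w") as myzip:
        myzip.writestr("first.txt", "first version")

    # Re-opening in "w" mode starts again from an empty archive.
    with zipfile.ZipFile(path, "w") as myzip:
        myzip.writestr("second.txt", "second version")

    with zipfile.ZipFile(path) as myzip:
        remaining = myzip.namelist()

print(remaining)
```

Only the member written in the second pass survives, so use w only when replacing the whole archive is what you want.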

Adding multiple files

import zipfile

# Adding multiple files
with zipfile.ZipFile('', 'w') as myzip:
    myzip.write('geek.txt')
    myzip.write('')
    myzip.printdir()


File Name                                             Modified             Size
geek.txt                                       2022-07-05 14:52:01           85
                                               2022-07-05 14:52:01          136

Note: The file you pass as an argument to .write() must exist.

If you pass a file that doesn't exist, or try to create the archive in a directory that doesn't exist, a FileNotFoundError is raised.

import zipfile

# Passing a non-existing directory
with zipfile.ZipFile('hello/', 'w') as myzip:
    myzip.write('geek.txt')

FileNotFoundError: [Errno 2] No such file or directory: 'hello/'

import zipfile

# Passing a non-existing file
with zipfile.ZipFile('', 'w') as myzip:
    myzip.write('hello.txt')

FileNotFoundError: [WinError 2] The system cannot find the file specified: 'hello.txt'
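If a missing input file should not stop the whole script, the error can be caught and the file skipped. This is a minimal sketch of that idea; the paths are hypothetical and live in a temporary directory.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    archive_path = os.path.join(tmp, "demo.zip")       # hypothetical names
    missing = os.path.join(tmp, "does_not_exist.txt")

    with zipfile.ZipFile(archive_path, "w") as myzip:
        try:
            myzip.write(missing)
            added = True
        except FileNotFoundError as error:
            # Report the problem and carry on with an empty archive.
            added = False
            print(f"Skipping missing file: {error.filename}")

        names = myzip.namelist()

print(added, names)
```

Whether to skip, log, or abort is a design choice that depends on how important each input file is to you.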

Appending files to the existing ZIP archive

To append files to an existing ZIP archive, use append mode a.

import zipfile

# Appending files to the existing zip file
with zipfile.ZipFile('', 'a') as myzip:
    myzip.write('index.html')
    myzip.printdir()


File Name                                             Modified             Size
geek.txt                                       2022-07-05 14:52:00           85
index.html                                     2022-07-05 15:32:35          176
                                               2022-07-05 14:52:01          136
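In contrast to w, append mode keeps whatever is already in the archive. The sketch below shows this with an in-memory archive (ZipFile also accepts file-like objects such as io.BytesIO); the member contents are made up for the example.

```python
import io
import zipfile

# Build a small archive in memory instead of on disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as myzip:
    myzip.writestr("geek.txt", "original member")

# Append mode keeps existing members and adds new ones.
with zipfile.ZipFile(buffer, "a") as myzip:
    myzip.writestr("index.html", "<h1>Hello</h1>")
    names = myzip.namelist()

print(names)
```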

Reading Metadata

There are some methods that help us read the metadata of ZIP archives.

  • .getinfo(filename): Returns a ZipInfo object that holds information about the member file given by filename.
  • .infolist(): Returns a list containing a ZipInfo object for each member of the archive.
  • .namelist(): Returns a list of archive members by name.

There is also .printdir(), which we have already used.

import zipfile

with zipfile.ZipFile("", mode="r") as arch:
    myzip = arch.getinfo("geek.txt")
    print(myzip.filename)
    print(myzip.date_time)
    print(myzip.file_size)
    print(myzip.compress_size)

>>> geek.txt

>>> (2022, 7, 5, 14, 52, 0)

>>> 85

>>> 85
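One detail worth knowing is that .getinfo() raises a KeyError when the requested name is not in the archive. The following sketch shows one way to handle that, using an in-memory archive and a hypothetical missing member called missing.txt.

```python
import io
import zipfile

# Build a small archive in memory; ZipFile accepts file-like objects.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as arch:
    arch.writestr("geek.txt", "some sample text")

with zipfile.ZipFile(buffer) as arch:
    info = arch.getinfo("geek.txt")
    print(info.filename, info.file_size)

    try:
        arch.getinfo("missing.txt")  # hypothetical name, not in the archive
        found = True
    except KeyError:
        found = False

print(found)
```

Checking the name against .namelist() first is an alternative if you prefer to avoid the exception.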

Extracting information about the files in a specified archive using .infolist()

import zipfile
import datetime

date = datetime.datetime

with zipfile.ZipFile("", "r") as info:
    for arch in info.infolist():
        print(f"The file name is: {arch.filename}")
        print(f"The file size is: {arch.file_size} bytes")
        print(f"The compressed size is: {arch.compress_size} bytes")
        print(f"Date of creation: {date(*arch.date_time)}")

        print("-" * 15)

The file name is: geek.txt
The file size is: 85 bytes
The compressed size is: 85 bytes
Date of creation: 2022-07-05 14:52:00
---------------
The file name is: index.html
The file size is: 176 bytes
The compressed size is: 176 bytes
Date of creation: 2022-07-05 15:32:34
---------------
The file name is:
The file size is: 136 bytes
The compressed size is: 136 bytes
Date of creation: 2022-07-05 14:52:00
---------------

Let's see some more ZipInfo attributes

import zipfile
with zipfile.ZipFile("", "r") as info:
    for arch in info.infolist():
        if arch.create_system == 0:
            system = "Windows"
        elif arch.create_system == 3:
            system = "UNIX"
        else:
            system = "Unknown"

        print(f"ZIP version: {arch.create_version}")
        print(f"Create System: {system}")
        print(f"External Attributes: {arch.external_attr}")
        print(f"Internal Attributes: {arch.internal_attr}")
        print(f"Comments: {arch.comment}")

        print("-" * 15)

ZIP version: 20
Create System: Windows
External Attributes: 32
Internal Attributes: 1
Comments: b''
---------------
ZIP version: 20
Create System: Windows
External Attributes: 32
Internal Attributes: 1
Comments: b''
---------------
ZIP version: 20
Create System: Windows
External Attributes: 32
Internal Attributes: 1
Comments: b''
---------------

.create_system returns an integer code:

  • 0 - for Windows
  • 3 - for Unix

Example for showing the use of .namelist()

import zipfile

with zipfile.ZipFile("", "r") as files:
    for files_list in files.namelist():
        print(files_list)


Reading and Writing Member files

Member files are the files that are present inside a ZIP archive.

To read the content of a member file without extracting it, we use .read(). It takes name, which is the name of the file in the archive, and an optional pwd, which is the password used for encrypted files.

import zipfile

with zipfile.ZipFile("", "r") as zif:
    for lines in zif.read("intro.txt").split(b"\r\n"):
        print(lines)

b'Hey, Welcome to GeekPython!'
b'Are you enjoying it?'
b"Now it's time, see you later!"

We used .split() to break the bytes into lines using the separator \r\n, and added b as a prefix because we are working with a bytes object.
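If you would rather work with regular strings than bytes, the result of .read() can be decoded first. This sketch assumes the member is UTF-8 text, which is an assumption about the data rather than something zipfile guarantees; the member content is invented for the example.

```python
import io
import zipfile

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zif:
    # Hypothetical member with Windows-style line endings.
    zif.writestr("intro.txt", "Hey, Welcome!\r\nSee you later!")

with zipfile.ZipFile(buffer) as zif:
    raw = zif.read("intro.txt")   # bytes
    text = raw.decode("utf-8")    # str, assuming UTF-8 content
    lines = text.splitlines()     # handles \n and \r\n alike

print(lines)
```

Using .splitlines() avoids having to know in advance which line ending the file uses.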

Other than .read(), we can use .open(), which allows us to read, write, and add a new member file in a flexible way because, just like the open() function, it implements the context manager protocol and therefore supports the with statement.

import zipfile

with zipfile.ZipFile("", "r") as my_zip:
    with my_zip.open("document.txt", "r") as data:
        for text in data:
            print(text)

b'Hey, I a document file inside the folder.\r\n'
b'Are you enjoying it.'
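The lines above are bytes because .open() returns a binary stream. One possible way to iterate over decoded text instead is to wrap that stream in io.TextIOWrapper, as in this sketch. The member content and the UTF-8 encoding are assumptions made for the example.

```python
import io
import zipfile

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as my_zip:
    my_zip.writestr("document.txt", "first line\nsecond line\n")

lines = []
with zipfile.ZipFile(buffer) as my_zip:
    with my_zip.open("document.txt") as data:
        # Wrap the binary stream so iteration yields decoded text lines.
        for text in io.TextIOWrapper(data, encoding="utf-8"):
            lines.append(text.rstrip("\n"))

print(lines)
```

This reads the member lazily, which can be preferable to .read() for large files.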

We can use .open() with write mode w to create a new member file and write content to it. With the archive itself opened in append mode a, the new member is added to the existing archive.

import zipfile

with zipfile.ZipFile("", "a") as my_zip:
    with my_zip.open("file.txt", "w") as data_file:
        data_file.write(b"Hi, I am a new file.")

with zipfile.ZipFile("", mode="r") as archive:
    archive.printdir()
    print("-" * 20)
    for line in archive.read("file.txt").split(b"\n"):
        print(line)

File Name                                             Modified             Size
data.txt                                       2022-07-04 18:17:30        37538
                                               2022-07-04 18:33:02         7064
document.txt                                   2022-07-06 17:08:36           76
file.txt                                       1980-01-01 00:00:00           20
--------------------
b'Hi, I am a new file.'

Extracting the ZIP archive

There are two methods for extracting a ZIP archive:

  • .extractall() - extracts all members of the archive into the current working directory. We can also specify the path of a directory of our choice.

import zipfile

with zipfile.ZipFile("", "r") as file:
    file.extractall("files")

All the member files will be extracted into a folder named files in your current working directory. You can specify another directory instead.

  • .extract() - extracts a single member from the archive into the current working directory. Keep in mind that you need to specify the member's full name, or pass a ZipInfo object.

import zipfile

with zipfile.ZipFile("", "r") as file:
    file.extract("hello.txt")

hello.txt will be extracted from the archive into the current working directory. To choose a different output directory, pass path="output_directory/" as an argument to .extract().
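Both methods can be combined with the path argument, and .extractall() also accepts a members argument to extract only a subset. The sketch below tries this in a temporary directory; all file and folder names in it are hypothetical.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    archive_path = os.path.join(tmp, "demo.zip")   # hypothetical names
    out_dir = os.path.join(tmp, "output_directory")

    with zipfile.ZipFile(archive_path, "w") as file:
        file.writestr("hello.txt", "hello")
        file.writestr("notes/todo.txt", "todo")
        file.writestr("image.bin", b"\x00\x01")

    with zipfile.ZipFile(archive_path) as file:
        # Extract a single member; the return value is the path created.
        single = file.extract("hello.txt", path=out_dir)

        # Extract only the .txt members using the members argument.
        txt_members = [n for n in file.namelist() if n.endswith(".txt")]
        file.extractall(path=out_dir, members=txt_members)

    # Collect what ended up on disk, relative to the output directory.
    extracted = sorted(
        os.path.relpath(os.path.join(root, name), out_dir).replace(os.sep, "/")
        for root, _dirs, names in os.walk(out_dir)
        for name in names
    )

print(extracted)
```

Only the two text files are extracted; image.bin stays inside the archive.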

Creating ZIP files

Creating a ZIP file simply means writing existing files into a new archive.

# Creating archive using zipfile module
import zipfile

files = ["hello.txt", "", "python.txt"]

with zipfile.ZipFile("", "w") as archive:
    for file in files:
        archive.write(file)

Or you can simply add files by directly specifying their full names.

import zipfile

with zipfile.ZipFile("", "w") as archive:
    archive.write("hello.txt")
    archive.write("python.txt")
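The same idea extends to a whole folder. Here is a minimal sketch, assuming a hypothetical source folder with one subfolder, that walks the tree with pathlib and uses the arcname argument of .write() to store paths relative to that folder.

```python
import pathlib
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    source = root / "source"               # hypothetical folder layout
    (source / "sub").mkdir(parents=True)
    (source / "hello.txt").write_text("hello")
    (source / "sub" / "python.txt").write_text("python")

    archive_path = root / "demo.zip"
    with zipfile.ZipFile(archive_path, "w") as archive:
        for path in sorted(source.rglob("*")):
            if path.is_file():
                # arcname stores a path relative to the source folder.
                arcname = path.relative_to(source).as_posix()
                archive.write(path, arcname=arcname)

    with zipfile.ZipFile(archive_path) as archive:
        names = archive.namelist()

print(names)
```

Without arcname, the members would be stored under the full temporary path, which is rarely what you want.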

Creating ZIP files using shutil

The shutil module also provides an easy way to make a ZIP archive.

import shutil

shutil.make_archive("archive", "zip", "files")

Here archive is the base name of the file to create (without the extension), zip is the archive format, which also determines the .zip extension added to the name, and files is the folder whose contents will be archived.

Unpacking the ZIP archive using shutil

import shutil

shutil.unpack_archive("", "archive")

Here the first argument is the ZIP archive, and archive is the directory into which its contents are extracted.
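The two shutil functions can be tried together as a round trip. This sketch uses a temporary directory and hypothetical folder names (files, archive, unpacked) so it leaves nothing behind.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "files")      # hypothetical folder names
    os.makedirs(src)
    with open(os.path.join(src, "hello.txt"), "w") as fh:
        fh.write("hello")

    # make_archive returns the full path of the archive it created.
    created = shutil.make_archive(os.path.join(tmp, "archive"), "zip", src)

    dest = os.path.join(tmp, "unpacked")
    shutil.unpack_archive(created, dest)
    unpacked = os.listdir(dest)

print(os.path.basename(created), unpacked)
```

Note that make_archive() adds the .zip extension itself, so the base name is passed without it.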

Compressing ZIP files

Usually, when we use zipfile to make a ZIP archive, the result is actually uncompressed because, by default, it uses the ZIP_STORED compression method.

The member files are simply stored in a container, the archive, at their original size.

So, to compress them, we need to pass the compression argument to ZipFile.

There are three constants for compressing files:

  • zipfile.ZIP_DEFLATED - requires zlib module and compression method is deflate.

  • zipfile.ZIP_BZIP2 - requires bz2 module and compression method is BZIP2.

  • zipfile.ZIP_LZMA - requires lzma module and compression method is LZMA.

import zipfile

with zipfile.ZipFile("", "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.write("hello.txt")

import zipfile

with zipfile.ZipFile("", "w", compression=zipfile.ZIP_BZIP2) as archive:
    archive.write("hello.txt")

We can also set a compression level with compresslevel. For ZIP_DEFLATED it accepts integers from 0 to 9 (ZIP_BZIP2 accepts 1 to 9), where 9 gives maximum compression.

import zipfile

with zipfile.ZipFile("", "w", compression=zipfile.ZIP_DEFLATED, compresslevel=9) as archive:
    archive.write("hello.txt")
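To see what the compression argument actually changes, the sketch below writes the same made-up data with ZIP_STORED and ZIP_DEFLATED and compares the sizes recorded in the archive. It assumes the zlib module is available, which ZIP_DEFLATED requires.

```python
import io
import zipfile

data = b"The quick brown fox jumps over the lazy dog. " * 200  # sample data

sizes = {}
for label, method in (("stored", zipfile.ZIP_STORED),
                      ("deflated", zipfile.ZIP_DEFLATED)):
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=method) as archive:
        archive.writestr("sample.txt", data)
    with zipfile.ZipFile(buffer) as archive:
        info = archive.getinfo("sample.txt")
        # (original size, size inside the archive)
        sizes[label] = (info.file_size, info.compress_size)

print(sizes)
```

With ZIP_STORED both numbers are equal, while ZIP_DEFLATED stores the repetitive sample in far fewer bytes. Real-world savings depend heavily on the data.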

Did you know that zipfile can be run from the command line?

Run zipfile from Command Line Interface

Here are some options which allow us to list, create, and extract ZIP archives from the command line.

-l or --list: List files in a zipfile.

python -m zipfile -l

File Name                                             Modified             Size
Streamlit-Apps-master/                         2022-06-30 23:31:36            0
Streamlit-Apps-master/Covid-19.csv             2022-06-30 23:31:36         1924
Streamlit-Apps-master/Covid-Banner.png         2022-06-30 23:31:36       538140
Streamlit-Apps-master/Procfile                 2022-06-30 23:31:36           40
Streamlit-Apps-master/                2022-06-30 23:31:36          901
Streamlit-Apps-master/WebAppPreview.png        2022-06-30 23:31:36       145818
Streamlit-Apps-master/                   2022-06-30 23:31:36         3162
Streamlit-Apps-master/requirements.txt         2022-06-30 23:31:36           46
Streamlit-Apps-master/                 2022-06-30 23:31:36          220

It just works like .printdir().

-c or --create: Create zipfile from source files.

python -m zipfile --create python.txt hello.txt

It will create a ZIP archive with the given name and add the files specified above.

Creating a ZIP file to archive entire directory

python -m zipfile --create source/

python -m zipfile -l

File Name                                             Modified             Size
source/                                        2022-07-07 17:22:42            0
source/archive/                                2022-07-07 10:58:06            0
source/archive/hello.txt                       2022-07-07 10:58:06           62
source/archive/index.html                      2022-07-07 10:58:06          176
source/archive/intro.txt                       2022-07-07 10:58:06           86
source/archive/                      2022-07-07 10:58:06          136
source/                                 2022-07-07 15:50:28           45
source/hello.txt                               2022-07-07 12:21:22           61

-e or --extract: Extract zipfile into target directory.

python -m zipfile --extract extracted/

The archive will be extracted into the extracted directory.

-t or --test: Test whether the zipfile is valid or not.

python -m zipfile --test

Traceback (most recent call last):
BadZipFile: File is not a zip file
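A rough programmatic counterpart to the --test option is the .testzip() method, which reads every member and checks its CRC and file header. The sketch below runs it on a small in-memory archive created just for this example.

```python
import io
import zipfile

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("hello.txt", "hello")

with zipfile.ZipFile(buffer) as archive:
    # testzip() returns the name of the first corrupt member, or None.
    first_bad = archive.testzip()

print("OK" if first_bad is None else f"Corrupt member: {first_bad}")
```

Keep in mind that a file that is not a ZIP archive at all still raises BadZipFile when it is opened, before .testzip() is ever reached.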


Phew, that was a long module to cover, and I still haven't covered every topic.

But this is enough to get started with the zipfile module and manipulate ZIP archives without even extracting them.

ZIP files have real benefits: they save disk storage, transfer faster over a network, and more.

We learned some useful operations that we can perform on ZIP archives using the zipfile module, like:

  • Reading, writing, and extracting existing ZIP archives

  • Reading the metadata

  • Creating ZIP archives

  • Manipulating member files

  • Running zipfile from command line

That's all for now

Keep Coding✌✌


