How to Delete Files Inside an S3 Folder Using Boto3 (Without Removing the Folder)

Amazon S3 (Simple Storage Service) is a popular object storage service used for storing and retrieving files (objects) at scale. Unlike traditional file systems, S3 uses a flat structure where there are no "real" folders or directories—instead, objects are organized using prefixes in their keys (e.g., my-folder/file.txt has the prefix my-folder/).

A common task for S3 users is to delete all files within a specific "folder" (prefix) without accidentally removing the folder itself (or other files outside the folder). For example, you might want to clean up old logs in logs/2023/ but keep the logs/ structure intact.

In this blog, we’ll walk through how to achieve this using Boto3—the official AWS SDK for Python. We’ll cover listing objects in the target folder, deleting them in batches, handling large datasets, and verifying the deletion. By the end, you’ll have a clear, repeatable process to clean up files in S3 folders.

Table of Contents#

  1. Prerequisites
  2. Understanding S3 "Folders": Prefixes vs. Directories
  3. Setting Up Boto3
  4. Step 1: List Objects in the Target S3 Folder
  5. Step 2: Delete Files in the Folder
  6. Handling Large Numbers of Files (Over 1000 Objects)
  7. Verifying Deletion
  8. Conclusion
  9. References

Prerequisites#

Before getting started, ensure you have the following:

  • AWS Account: You need an AWS account with access to S3.
  • AWS Credentials: Configure AWS credentials on your machine (via ~/.aws/credentials, environment variables, or IAM roles if running on AWS services like EC2/EKS).
  • Python & Boto3: Install Python (3.6+) and Boto3. Run pip install boto3 to install Boto3.
  • IAM Permissions: The IAM user/role needs:
    • s3:ListBucket permission on the target bucket (to list objects).
    • s3:DeleteObject permission on the objects (or bucket) to delete files.
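The permissions above can be granted with a minimal IAM policy. The sketch below expresses one as a Python dict (handy if you attach policies programmatically); the bucket name my-bucket and prefix my-folder/ in the ARNs are placeholders to substitute with your own values:

```python
import json

# Minimal policy: list the bucket, delete only objects under the target prefix.
# "my-bucket" and "my-folder/" are placeholders -- substitute your own values.
DELETE_FOLDER_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ListTargetBucket",
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::my-bucket",
        },
        {
            "Sid": "DeleteObjectsInFolder",
            "Effect": "Allow",
            "Action": "s3:DeleteObject",
            "Resource": "arn:aws:s3:::my-bucket/my-folder/*",
        },
    ],
}

policy_json = json.dumps(DELETE_FOLDER_POLICY, indent=2)
```

Note that s3:ListBucket applies to the bucket ARN, while s3:DeleteObject applies to object ARNs (bucket ARN plus key pattern)—a common source of AccessDenied errors when the two are mixed up.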

Understanding S3 "Folders": Prefixes vs. Directories#

S3 does not have traditional directories. What appears as a "folder" in the AWS Console or S3 clients (e.g., my-folder/) is actually a prefix in the object key. For example:

  • An object with key my-folder/report.pdf is logically inside the "folder" my-folder/.
  • The "folder" itself is not a tangible entity—it’s just a visual representation of the prefix.

Thus, there is no separate "delete folder" operation in S3. Instead, we delete all objects whose keys start with the target prefix (e.g., my-folder/). Once all of its objects are gone, the "folder" no longer appears in the console (unless a placeholder object with key my-folder/ exists—more on this later).
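Because the namespace is flat, "inside the folder" simply means "the key starts with the prefix". This pure-Python sketch (the key names are illustrative) mimics S3's prefix matching and shows why the trailing slash matters:

```python
# All objects in a bucket live in one flat namespace of keys.
keys = [
    "my-folder/report.pdf",
    "my-folder/logs/app.log",
    "my-folder-report.pdf",   # NOT inside my-folder/, despite the similar name
    "other/readme.txt",
]

def keys_in_folder(keys, prefix):
    """Mimic S3 prefix filtering: a key is 'in the folder' iff it starts with the prefix."""
    return [k for k in keys if k.startswith(prefix)]

in_folder = keys_in_folder(keys, "my-folder/")   # only the first two keys
no_slash = keys_in_folder(keys, "my-folder")     # also matches my-folder-report.pdf
```

The second call is exactly the bug you invite by omitting the trailing slash in a Prefix parameter: sibling keys that merely share the prefix string get swept up too.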

Setting Up Boto3#

Before writing any deletion code, install Boto3 and initialize an S3 client. This client will interact with S3 to list and delete objects.

Install Boto3#

If you haven’t already, install Boto3:

pip install boto3  

Initialize the S3 Client#

Use Boto3 to create an S3 client. Boto3 automatically reads credentials from ~/.aws/credentials, environment variables, or IAM roles.

import boto3  
 
# Initialize S3 client  
s3_client = boto3.client('s3')  

Step 1: List Objects in the Target S3 Folder#

To delete files in a folder, we first need to list all objects with the target prefix. Use the list_objects_v2 API for this (it supersedes the original list_objects API and offers cleaner pagination via continuation tokens).

Key Parameters for list_objects_v2:#

  • Bucket: Name of your S3 bucket (e.g., my-bucket).
  • Prefix: The folder prefix (e.g., my-folder/—include the trailing slash to avoid matching keys like my-folder-report.pdf).
  • MaxKeys: Optional (default: 1000). Limits the number of objects returned per request.

Example: List Objects in my-folder/#

bucket_name = "my-bucket"  
folder_prefix = "my-folder/"  # Trailing slash ensures we target only objects IN the folder  
 
# List objects in the folder  
response = s3_client.list_objects_v2(  
    Bucket=bucket_name,  
    Prefix=folder_prefix  
)  
 
# Extract objects (if any)  
objects = response.get("Contents", [])  # the "Contents" key is absent when no objects match  
 
if not objects:  
    print(f"No objects found in {folder_prefix}")  
else:  
    print(f"Found {len(objects)} objects in {folder_prefix}:")  
    for obj in objects:  
        print(f"- {obj['Key']}")  # obj['Key'] is the full object key (e.g., "my-folder/file1.txt")  

Handling Pagination#

By default, list_objects_v2 returns up to 1000 objects per request. If there are more than 1000 objects, use ContinuationToken to paginate through results (we’ll cover this later for large datasets).
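If you would rather not manage ContinuationToken by hand, Boto3 also ships a built-in paginator for list_objects_v2. The sketch below (bucket and prefix values are placeholders) wraps it in a small function; the page-flattening helper is plain Python, so it is easy to test without talking to AWS:

```python
def keys_from_pages(pages):
    """Flatten list_objects_v2 pages into a flat list of object keys.

    Pure helper: works on any iterable of page dicts.
    """
    return [obj["Key"] for page in pages for obj in page.get("Contents", [])]

def list_all_keys(bucket_name, folder_prefix):
    import boto3  # imported here so the pure helper above works without boto3 installed

    s3_client = boto3.client("s3")
    # get_paginator handles ContinuationToken / NextContinuationToken for you.
    paginator = s3_client.get_paginator("list_objects_v2")
    pages = paginator.paginate(Bucket=bucket_name, Prefix=folder_prefix)
    return keys_from_pages(pages)

# Usage (requires AWS credentials):
#   all_keys = list_all_keys("my-bucket", "my-folder/")
```

We will still walk through manual ContinuationToken handling below, since it makes the mechanics explicit.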

Step 2: Delete Files in the Folder#

Once you’ve listed the objects, use the delete_objects API to delete them. This API allows deleting up to 1000 objects in a single request (more efficient than deleting one object at a time).

Option 1: Delete a Single Object#

If you only need to delete one file (e.g., my-folder/old_log.txt), use delete_object:

object_key = "my-folder/old_log.txt"  
 
s3_client.delete_object(  
    Bucket=bucket_name,  
    Key=object_key  
)  
 
print(f"Deleted object: {object_key}")  

Option 2: Delete Multiple Objects (Batch Deletion)#

To delete multiple objects, pass a list of object keys to delete_objects.

Example: Delete All Objects in my-folder/#

bucket_name = "my-bucket"  
folder_prefix = "my-folder/"  
 
# Step 1: List objects in the folder  
response = s3_client.list_objects_v2(  
    Bucket=bucket_name,  
    Prefix=folder_prefix  
)  
objects = response.get("Contents", [])  
 
if not objects:  
    print("No objects to delete.")  
else:  
    # Step 2: Prepare list of object keys to delete  
    objects_to_delete = [{"Key": obj["Key"]} for obj in objects]  
 
    # Step 3: Delete objects in batch  
    delete_response = s3_client.delete_objects(  
        Bucket=bucket_name,  
        Delete={  
            "Objects": objects_to_delete,  
            "Quiet": False  # Set to True to suppress successful deletion details  
        }  
    )  
 
    # Check for successful deletions and errors  
    if "Deleted" in delete_response:  
        print(f"Successfully deleted {len(delete_response['Deleted'])} objects:")  
        for deleted in delete_response["Deleted"]:  
            print(f"- {deleted['Key']}")  
 
    if "Errors" in delete_response:  
        print(f"Errors deleting {len(delete_response['Errors'])} objects:")  
        for error in delete_response["Errors"]:  
            print(f"- {error['Key']}: {error['Message']}")  

Handling Large Numbers of Files (Over 1000 Objects)#

If the folder contains more than 1000 objects, list_objects_v2 will return only the first 1000. To delete all objects, you need to:

  1. Paginate through the list of objects (using ContinuationToken).
  2. Batch delete in chunks of up to 1000 objects per request.

Example: Delete All Objects in a Large Folder#

Here’s a script that handles pagination and batch deletion:

import boto3  
 
def delete_files_in_s3_folder(bucket_name, folder_prefix):  
    s3_client = boto3.client('s3')  
    continuation_token = None  
 
    while True:  
        # List objects with pagination  
        list_kwargs = {  
            "Bucket": bucket_name,  
            "Prefix": folder_prefix  
        }  
        if continuation_token:  
            list_kwargs["ContinuationToken"] = continuation_token  
 
        response = s3_client.list_objects_v2(**list_kwargs)  
        objects = response.get("Contents", [])  
 
        if not objects:  
            print("No more objects to delete.")  
            break  
 
        # Delete objects in batches of up to 1000  
        objects_to_delete = [{"Key": obj["Key"]} for obj in objects]  
        delete_response = s3_client.delete_objects(  
            Bucket=bucket_name,  
            Delete={"Objects": objects_to_delete, "Quiet": False}  
        )  
 
        # Print results  
        if "Deleted" in delete_response:  
            print(f"Deleted {len(delete_response['Deleted'])} objects.")  
        if "Errors" in delete_response:  
            print(f"Errors: {delete_response['Errors']}")  
 
        # Check if there are more objects to list  
        if not response.get("IsTruncated"):  
            break  # No more pages  
        continuation_token = response["NextContinuationToken"]  
 
 
# Usage  
bucket_name = "my-bucket"  
folder_prefix = "my-folder/"  # Target folder  
delete_files_in_s3_folder(bucket_name, folder_prefix)  
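For completeness, Boto3's higher-level resource API can do the whole list-and-batch-delete dance in one call: bucket.objects.filter(Prefix=...).delete() paginates and deletes in chunks of up to 1000 internally. A sketch; normalize_prefix is a small hypothetical helper (not part of Boto3) that guards against the missing-trailing-slash pitfall discussed earlier:

```python
def normalize_prefix(prefix):
    """Ensure the prefix ends with '/' so sibling keys like 'my-folder-x' are not matched."""
    return prefix if prefix.endswith("/") else prefix + "/"

def delete_folder_contents(bucket_name, folder_prefix):
    import boto3  # imported here so normalize_prefix works without boto3 installed

    s3 = boto3.resource("s3")
    bucket = s3.Bucket(bucket_name)
    # The resource API handles pagination and 1000-object batching internally.
    return bucket.objects.filter(Prefix=normalize_prefix(folder_prefix)).delete()

# Usage (requires AWS credentials):
#   delete_folder_contents("my-bucket", "my-folder")
```

The explicit client-based loop above gives you finer control over error handling per batch; the resource one-liner is convenient when you just want the folder emptied.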

Verifying Deletion#

After deletion, verify that all objects in the folder are gone:

Option 1: List Objects Again#

Re-run the list_objects_v2 command to check for remaining objects:

response = s3_client.list_objects_v2(  
    Bucket=bucket_name,  
    Prefix=folder_prefix  
)  
remaining_objects = response.get("Contents", [])  
 
if remaining_objects:  
    print(f"Warning: {len(remaining_objects)} objects remain:")  
    for obj in remaining_objects:  
        print(f"- {obj['Key']}")  
else:  
    print("All objects in the folder have been deleted.")  

Option 2: Check delete_objects Response#

The delete_response from delete_objects includes a Deleted list (successful deletions) and an Errors list (failed deletions). Use this to confirm success:

if "Errors" in delete_response:  
    print(f"Some deletions failed. Check errors: {delete_response['Errors']}")  
else:  
    print("All deletions succeeded!")  

Note: Placeholder Objects#

If the "folder" still appears in the AWS Console after deletion, it may be due to a placeholder object (a zero-byte object whose key is the prefix itself, e.g., my-folder/). The console creates one of these when you use its "Create folder" button. To remove the placeholder, delete it explicitly:

s3_client.delete_object(Bucket=bucket_name, Key=folder_prefix)  # Deletes the placeholder  
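A related caveat: on a bucket with versioning enabled, delete_objects without version IDs only inserts delete markers; the underlying object versions remain (and still incur storage costs). To remove everything under the prefix you must delete each version explicitly via list_object_versions. A sketch under that assumption—the page-flattening helper is pure Python, and the extra IAM actions noted in the comment (s3:ListBucketVersions, s3:DeleteObjectVersion) should be verified against your own policy setup:

```python
def versions_to_delete(page):
    """Collect {'Key', 'VersionId'} pairs from a list_object_versions page,
    covering both real object versions and delete markers."""
    entries = page.get("Versions", []) + page.get("DeleteMarkers", [])
    return [{"Key": e["Key"], "VersionId": e["VersionId"]} for e in entries]

def delete_all_versions(bucket_name, folder_prefix):
    import boto3  # imported here so the pure helper above works without boto3 installed

    s3_client = boto3.client("s3")
    paginator = s3_client.get_paginator("list_object_versions")
    for page in paginator.paginate(Bucket=bucket_name, Prefix=folder_prefix):
        batch = versions_to_delete(page)
        if batch:
            s3_client.delete_objects(
                Bucket=bucket_name,
                Delete={"Objects": batch, "Quiet": True},
            )

# Usage (requires AWS credentials plus s3:ListBucketVersions and
# s3:DeleteObjectVersion permissions):
#   delete_all_versions("my-bucket", "my-folder/")
```

If your bucket is unversioned, the earlier scripts are all you need.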

Conclusion#

Deleting files inside an S3 "folder" using Boto3 is straightforward once you understand that S3 uses prefixes instead of directories. The key steps are:

  1. List objects with the target prefix (folder name).
  2. Delete the objects in batches (up to 1000 per request).
  3. Handle pagination for large datasets.

This approach ensures you clean up files without affecting other parts of your bucket. Remember: S3 "folders" are just prefixes, so deleting all objects with a prefix effectively "empties" the folder.

References#