How to persist data on RunPod
Quick answer
To persist data on RunPod, either attach persistent storage (for example, a mounted volume) or save data to an external cloud storage service such as S3 from within your serverless handler. A common pattern is to write files to ephemeral storage at runtime and upload them to durable storage before the handler returns.
Prerequisites
- Python 3.8+
- RunPod account and API key
- pip install runpod
Setup
Install the runpod Python package and set your API key as an environment variable for authentication. The S3 example in the next section also requires boto3 (pip install boto3).
pip install runpod

Output:
Collecting runpod
  Downloading runpod-1.0.0-py3-none-any.whl (15 kB)
Installing collected packages: runpod
Successfully installed runpod-1.0.0
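The setup step says to set the API key as an environment variable but does not show how it is read. A minimal sketch, assuming the variable is named RUNPOD_API_KEY (the name used in the example in this guide); the helper name load_api_key is hypothetical and not part of the runpod package:

```python
import os


def load_api_key(var_name="RUNPOD_API_KEY"):
    """Return the API key from the environment, failing early with a clear message."""
    # Treat an unset variable and a blank one the same way.
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set; export it before starting the worker"
        )
    return key
```

Calling load_api_key() at startup surfaces a missing key immediately, instead of as a bare KeyError or an authentication failure later on.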
Step by step
This example demonstrates how to persist data by saving a file locally during the job execution and then uploading it to an external cloud storage (e.g., AWS S3) before the job finishes.
import os
import runpod
import boto3

# Set your RunPod API key in environment variables
runpod.api_key = os.environ["RUNPOD_API_KEY"]

# Initialize S3 client (ensure AWS credentials are set in env vars or config)
s3 = boto3.client('s3')

# Define your RunPod serverless handler
def handler(job):
    # Write data to a local file in ephemeral storage
    local_path = "/tmp/output.txt"
    with open(local_path, "w") as f:
        f.write("Persisted data from RunPod job")

    # Upload the file to S3 bucket for persistence
    bucket_name = "your-s3-bucket"
    s3_key = "runpod/output.txt"
    s3.upload_file(local_path, bucket_name, s3_key)

    return {"result": f"File uploaded to s3://{bucket_name}/{s3_key}"}

# Start the serverless function
runpod.serverless.start({"handler": handler})

Output:
INFO: Starting RunPod serverless function...
INFO: Job received
INFO: File uploaded to s3://your-s3-bucket/runpod/output.txt
INFO: Job completed successfully
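Because the file only persists if the upload actually succeeded, it can help to confirm the stored object before the handler returns. The following is a sketch, not part of the original example: the helper name upload_and_confirm is hypothetical, and the client argument is assumed to behave like a boto3 S3 client, which provides upload_file and head_object.

```python
import os


def upload_and_confirm(client, local_path, bucket, key):
    """Upload a file, then check that the stored object's size matches the local file.

    `client` is assumed to behave like a boto3 S3 client, i.e. it provides
    upload_file(Filename, Bucket, Key) and head_object(Bucket=..., Key=...).
    """
    expected_size = os.path.getsize(local_path)
    # upload_file blocks until the transfer has finished or raised an error.
    client.upload_file(local_path, bucket, key)
    stored_size = client.head_object(Bucket=bucket, Key=key)["ContentLength"]
    if stored_size != expected_size:
        raise IOError(
            f"size mismatch for s3://{bucket}/{key}: "
            f"local {expected_size} bytes, stored {stored_size} bytes"
        )
    return f"s3://{bucket}/{key}"
```

Inside the handler, this could replace the direct s3.upload_file(...) call, for example upload_and_confirm(s3, local_path, bucket_name, s3_key). The extra head_object request costs one round trip per file, which may not be worth it for very small or very frequent uploads.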
Common variations
You can swap AWS S3 for another cloud storage provider such as Google Cloud Storage or Azure Blob Storage. Alternatively, write to a mounted persistent volume (RunPod calls these network volumes) if your endpoint or pod has one attached. For asynchronous handlers, wait for uploads to complete before the handler returns.
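For the mounted-volume variation, persisting data is an ordinary file write to the mount point. The sketch below makes several assumptions: /runpod-volume is used as the default mount path and should be verified against your own environment, PERSIST_DIR is a hypothetical environment variable introduced only for this example, and persist_to_volume is not a RunPod API.

```python
import os
import tempfile

# Assumed mount point; check where the volume is mounted in your environment.
DEFAULT_VOLUME_DIR = "/runpod-volume"


def persist_to_volume(data, filename, volume_dir=None):
    """Write text to a file on a mounted volume and return the final path."""
    # PERSIST_DIR is a hypothetical variable name used only in this sketch.
    target_dir = volume_dir or os.environ.get("PERSIST_DIR", DEFAULT_VOLUME_DIR)
    if not os.path.isdir(target_dir):
        raise FileNotFoundError(f"volume directory not found: {target_dir}")

    final_path = os.path.join(target_dir, filename)
    # Write to a temporary file in the same directory, then rename it, so a
    # worker that stops mid-write does not leave a truncated file behind.
    fd, tmp_path = tempfile.mkstemp(dir=target_dir)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, final_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return final_path
```

Failing loudly when the directory is missing is deliberate: silently falling back to ephemeral storage would make the data appear saved while it is lost when the worker stops.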
Troubleshooting
- If you encounter permission errors uploading to cloud storage, verify your cloud credentials and bucket policies.
- If files are missing after job completion, confirm that uploads finish before the function terminates.
- Check that /tmp or other ephemeral storage paths are writable in your RunPod environment.
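The writability check from the last item can be done from inside the worker. A minimal sketch using only the standard library; the function name is_writable is hypothetical:

```python
import tempfile


def is_writable(directory):
    """Return True if a file can actually be created (and removed) in `directory`."""
    try:
        # Creating a real temporary file tests permissions, read-only mounts,
        # and missing directories in one step; the file is deleted on close.
        with tempfile.NamedTemporaryFile(dir=directory):
            pass
        return True
    except OSError:
        return False
```

For example, logging is_writable("/tmp") at the start of the handler shows quickly whether a missing file is a storage problem or an upload problem.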
Key Takeaways
- Use ephemeral storage during runtime and upload files to durable cloud storage for persistence on RunPod.
- Set API keys and cloud credentials securely via environment variables.
- Ensure uploads complete before the serverless function exits to avoid data loss.