GitHub: dbos_experiments/exp15

Experiment 15: DBOS Payload Size Performance Analysis

Purpose

This experiment measures the performance impact of different payload sizes in DBOS steps and compares two approaches:

  1. Multiple small step calls - Calling a step multiple times with incrementally larger payloads
  2. Single batched step call - Calling a step once that processes all payloads internally

The goal is to understand the overhead of DBOS step serialization, deserialization, and database storage as payload sizes increase from 1 byte to 1 MB.

Experiment Design

Approach 1: Multiple Step Calls (size_workflow)

Calls the size_step() function 7 times with increasing payload sizes:

  • Iteration 1: 10^0 = 1 byte
  • Iteration 2: 10^1 = 10 bytes
  • Iteration 3: 10^2 = 100 bytes
  • Iteration 4: 10^3 = 1,000 bytes (1 KB)
  • Iteration 5: 10^4 = 10,000 bytes (10 KB)
  • Iteration 6: 10^5 = 100,000 bytes (100 KB)
  • Iteration 7: 10^6 = 1,000,000 bytes (1 MB)

Total payload: 1,111,111 bytes across 7 separate step calls
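
The total follows from summing the seven powers of ten; a quick check in Python:

# Sum of the payload sizes 10^0 .. 10^6 used by the seven step calls
total = sum(10 ** k for k in range(7))
print(total)  # 1111111 bytes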

Approach 2: Single Batched Step (batch_size_workflow)

Calls batch_size_step() once, which internally generates all 7 payloads and returns them concatenated.

Total payload: 1,111,111 bytes in 1 step call

Key Observations

Performance Results

Workflow: Starting
Workflow: Iteration 1/7
Step: Size step with payload size 1 bytes
Workflow: Payload size is 1 bytes, took 55.41 ms
Workflow: Iteration 2/7
Step: Size step with payload size 10 bytes
Workflow: Payload size is 10 bytes, took 21.63 ms
Workflow: Iteration 3/7
Step: Size step with payload size 100 bytes
Workflow: Payload size is 100 bytes, took 21.71 ms
Workflow: Iteration 4/7
Step: Size step with payload size 1000 bytes
Workflow: Payload size is 1000 bytes, took 21.81 ms
Workflow: Iteration 5/7
Step: Size step with payload size 10000 bytes
Workflow: Payload size is 10000 bytes, took 23.87 ms
Workflow: Iteration 6/7
Step: Size step with payload size 100000 bytes
Workflow: Payload size is 100000 bytes, took 35.45 ms
Workflow: Iteration 7/7
Step: Size step with payload size 1000000 bytes
Workflow: Payload size is 1000000 bytes, took 120.46 ms
Workflow: Completed successfully in 300.35 ms
----------------------------------------------------
Workflow: Starting
Step: Batch size step iteration 1/7
Step: Batch size step iteration 2/7
Step: Batch size step iteration 3/7
Step: Batch size step iteration 4/7
Step: Batch size step iteration 5/7
Step: Batch size step iteration 6/7
Step: Batch size step iteration 7/7
Workflow: Payload size is 1111111 bytes, took 157.82 ms
Main: Workflow output: True

Analysis

Approach             | Total Time | Steps | Avg Time per Step
---------------------|------------|-------|------------------
Multiple small steps | 300.35 ms  | 7     | ~42.9 ms
Single batched step  | 157.82 ms  | 1     | N/A

Key Findings:

  1. First step overhead: The first step call takes ~55ms, likely due to initialization overhead
  2. Small payload consistency: Steps with payloads 1-1000 bytes take ~21-24ms consistently
  3. Scaling behavior: Performance degrades as payload size increases:
    • 100 KB: 35.45 ms
    • 1 MB: 120.46 ms
  4. Batching advantage: Single batched step is ~47% faster (157ms vs 300ms)
    • Eliminates 6 step serialization/deserialization cycles
    • Reduces database writes from 7 to 1
    • Avoids repeated DBOS framework overhead

Performance Breakdown

Step Size       | Time (ms) | Delta from Previous
----------------|-----------|--------------------
1 byte          | 55.41     | baseline (includes init)
10 bytes        | 21.63     | -33.78 ms (steady state)
100 bytes       | 21.71     | +0.08 ms
1 KB            | 21.81     | +0.10 ms
10 KB           | 23.87     | +2.06 ms
100 KB          | 35.45     | +11.58 ms
1 MB            | 120.46    | +85.01 ms (non-linear growth)

Performance Implications

When to Use Multiple Steps

  • Better granularity: Individual step recovery and retry
  • Better observability: Track progress of each payload size
  • Memory efficiency: Process data incrementally
  • Acceptable for small payloads (< 10 KB): Overhead is minimal (~21-24ms per step)

When to Use Batched Steps

  • Large payloads: Reduces serialization overhead significantly
  • High throughput requirements: 47% faster for same total data
  • Atomic operations: All-or-nothing processing
  • Simple workflows: When granular recovery isn’t needed

Code Structure

size_step(payload_size: int) -> bytes

  • Generates random bytes of size 10^payload_size
  • Returns the payload
  • Logs the payload size

size_workflow() -> bool

  • Calls size_step() 7 times with increasing sizes
  • Measures and logs time for each step call
  • Validates payload sizes
  • Returns True when every payload has the expected size (per the bool return type)
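
A minimal sketch of how these two functions might look with the DBOS Python SDK's @DBOS.step() / @DBOS.workflow() decorators; the exact logging, timing, and validation details in exp15/ex1.py may differ:

import os
import time

from dbos import DBOS

@DBOS.step()
def size_step(payload_size: int) -> bytes:
    # Per the description above: a payload of 10^payload_size random bytes
    size = 10 ** payload_size
    DBOS.logger.info(f"Step: Size step with payload size {size} bytes")
    return os.urandom(size)

@DBOS.workflow()
def size_workflow() -> bool:
    DBOS.logger.info("Workflow: Starting")
    workflow_start = time.perf_counter()
    for i in range(7):
        DBOS.logger.info(f"Workflow: Iteration {i + 1}/7")
        step_start = time.perf_counter()
        payload = size_step(i)
        elapsed_ms = (time.perf_counter() - step_start) * 1000
        DBOS.logger.info(f"Workflow: Payload size is {len(payload)} bytes, took {elapsed_ms:.2f} ms")
        if len(payload) != 10 ** i:  # validate each payload size
            return False
    total_ms = (time.perf_counter() - workflow_start) * 1000
    DBOS.logger.info(f"Workflow: Completed successfully in {total_ms:.2f} ms")
    return True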

batch_size_step(iterations: int) -> bytes

  • Generates all 7 payloads internally
  • Concatenates them into a single return value
  • Logs progress for each iteration

batch_size_workflow() -> bool

  • Calls batch_size_step() once with all iterations
  • Measures total time
  • Returns True on successful completion (per the bool return type)
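
A corresponding sketch for the batched variant, under the same assumptions:

import os
import time

from dbos import DBOS

@DBOS.step()
def batch_size_step(iterations: int) -> bytes:
    chunks = []
    for i in range(iterations):
        DBOS.logger.info(f"Step: Batch size step iteration {i + 1}/{iterations}")
        chunks.append(os.urandom(10 ** i))  # same per-iteration sizes as size_step
    return b"".join(chunks)  # one concatenated payload, serialized and stored once

@DBOS.workflow()
def batch_size_workflow() -> bool:
    DBOS.logger.info("Workflow: Starting")
    start = time.perf_counter()
    payload = batch_size_step(7)
    elapsed_ms = (time.perf_counter() - start) * 1000
    DBOS.logger.info(f"Workflow: Payload size is {len(payload)} bytes, took {elapsed_ms:.2f} ms")
    return len(payload) == sum(10 ** i for i in range(7))  # expect 1,111,111 bytes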

Usage

# Run the experiment
python exp15/ex1.py
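
For reference, one way the entry point could wire this up. This is a hypothetical sketch: the configuration API differs between DBOS releases (newer ones accept a config dict, older ones read dbos-config.yaml), and exp15/ex1.py may do this differently.

import os
from dbos import DBOS

# size_workflow and batch_size_workflow defined as in the sketches above

if __name__ == "__main__":
    DBOS(config={"name": "exp15", "database_url": os.environ["DBOS_DATABASE_URL"]})
    DBOS.launch()
    size_workflow()                 # Approach 1: seven individual step calls
    result = batch_size_workflow()  # Approach 2: one batched step call
    print(f"Main: Workflow output: {result}")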

Prerequisites

  • PostgreSQL database running on localhost:5432
  • Database: test with user trustle:trustle
  • Python dependencies: pip install dbos

Environment Variables

export DBOS_DATABASE_URL="postgresql://trustle:trustle@localhost:5432/test?sslmode=disable"

Learning Points

  1. DBOS step overhead: ~21-24ms baseline per step for small payloads
  2. Serialization cost: Grows non-linearly with payload size
  3. Database I/O impact: Each step write adds overhead
  4. Batching benefits: Significant performance gain for high-volume workflows
  5. Design trade-offs: Granularity vs. performance
  6. Scaling behavior: the 1 MB step takes ~3.4x longer than the 100 KB step (120.46 ms vs 35.45 ms) and ~5.5x the small-payload baseline (~22 ms)
  7. First call penalty: Initial step has 2.5x overhead (~55ms vs ~21ms)

Recommendations

  • Small data (< 10 KB): Use multiple steps for better observability
  • Medium data (10-100 KB): Balance between granularity and performance
  • Large data (> 100 KB): Consider batching or streaming approaches
  • High-throughput: Batch processing can save ~47% execution time
  • Critical workflows: Multiple steps provide better recovery granularity

Future Experiments

Potential areas to explore:

  • Compression impact on payload serialization
  • Streaming large payloads through multiple steps
  • Impact of concurrent step execution
  • Database storage costs for large payloads
  • Memory usage patterns for different approaches

Recent changes

  • 2025-10-18 66833e7 added readmes
  • 2025-10-14 247bfc9 new approach for the DB
  • 2025-10-14 f872897 test dbos step payload size

Categories: experiments, Python

Tags: dbos-experiments
