Workshop 3 Preview: Advanced Networking (Exercises 6-8)

Part 3 of 3: Wireless Tuning, Congestion Handling, and NAT Traversal

ros2
zenoh
workshop
roscon-india
networking
performance
Author

Rajesh

Published

December 17, 2025

Series Overview

This is Part 3 (final) of the Workshop 3 exercise preview:

Part Exercises Focus
Part 1 1-3 Fundamentals: Pub/Sub, QoS, Shared Memory
Part 2 4-5 Remote: Cloud Router, mTLS Security
Part 3 (This Post) 6-8 Advanced: Wireless, Congestion, NAT

Exercise 6: Wireless Performance Tuning

The WiFi Challenge

WiFi introduces challenges that wired Ethernet doesn’t have:

Challenge Impact Mitigation
Variable latency 5-100ms jitter Increase buffers
Packet loss 1-5% typical Enable reliability
Half-duplex Can’t TX and RX simultaneously Reduce bandwidth
Interference Other devices, microwaves Choose channels wisely

Bandwidth Reality Check

Theoretical vs Real WiFi throughput:

WiFi Standard Theoretical Real-World With Interference
802.11n (WiFi 4) 300 Mbps 100-150 Mbps 50-80 Mbps
802.11ac (WiFi 5) 867 Mbps 300-500 Mbps 150-250 Mbps
802.11ax (WiFi 6) 1200 Mbps 500-700 Mbps 300-400 Mbps

Robot sensor bandwidth:

Sensor Bandwidth
720p Camera @ 30 FPS ~27 MB/s (216 Mbps)
Depth Camera @ 30 FPS ~18 MB/s (144 Mbps)
LIDAR @ 20 Hz ~2 MB/s (16 Mbps)
Total ~47 MB/s (376 Mbps)
Figure 1: Robot sensor bandwidth often exceeds real-world WiFi capacity
The Math Doesn’t Work

A Go2 robot publishing full sensor data (~47 MB/s) exceeds typical WiFi 5 real-world throughput. Without optimization, you’ll experience: - Packet loss - Latency spikes - Dropped connections

Linux Kernel Tuning

# Increase network buffer sizes
sudo sysctl -w net.core.rmem_max=16777216
sudo sysctl -w net.core.wmem_max=16777216
sudo sysctl -w net.core.rmem_default=1048576
sudo sysctl -w net.core.wmem_default=1048576

# Use BBR congestion control (better for WiFi)
sudo sysctl -w net.ipv4.tcp_congestion_control=bbr

# Increase backlog for burst handling
sudo sysctl -w net.core.netdev_max_backlog=10000

Topic Filtering: The Biggest Win

Instead of sending everything, filter at the source:

# Before: All 47 topics bridged = 50 MB/s
zenoh-bridge-ros2dds

# After: Only essential topics = 100 KB/s
zenoh-bridge-ros2dds \
    --allow "/cmd_vel" \
    --allow "/odom" \
    --allow "/battery_state" \
    --deny ".*"
Configuration Bandwidth Reduction
All topics 50 MB/s -
Navigation only 3 MB/s 94%
Telemetry only 100 KB/s 99.8%

Hands-On Testing: Simulating WiFi Congestion

Figure 2: Using Linux Traffic Control to simulate WiFi congestion

We tested with a RealSense D435i camera (720p @ 30 FPS) using Linux Traffic Control to simulate WiFi congestion:

# Baseline (no limits)
ros2 topic bw /camera/camera/color/image_raw
# Result: 84-86 MB/s, 30 FPS, 2.76 MB per frame

# Simulate congested WiFi (200 Mbps + 10ms jitter)
tc qdisc add dev lo root netem delay 10ms 5ms rate 200mbit

# Measure again
ros2 topic hz /camera/camera/color/image_raw
# Result: ~5 FPS (83% drop!)
Eureka Moment #1: 83% FPS Drop!

With 200 Mbps bandwidth limit, our 720p camera dropped from 30 FPS to just 5 FPS. Why? The camera produces 84 MB/s (672 Mbps), but we only had 200 Mbps available!

The Compression Solution

The D435i publishes multiple formats via image_transport. Here’s what we measured:

Topic Size FPS (Congested) Compression
/image_raw 2.76 MB ~5 FPS 1x (baseline)
/image_raw/zstd 1.45 MB ~12 FPS 1.9x
/image_raw/compressed 0.23 MB ~30 FPS 12x smaller!
/depth/image_rect_raw 0.81 MB ~30 FPS N/A
Figure 3: Compression dramatically reduces bandwidth while maintaining frame rate
Eureka Moment #2: JPEG Compression is the WiFi Solution!

The JPEG compressed topic (0.23 MB) achieves full 30 FPS on the same congested network where raw (2.76 MB) only gets 5 FPS. That’s a 12x size reduction with no FPS loss!

Who Does the Compression?

The image_transport ROS 2 package does it automatically! When you launch the camera, these topics appear:

  • /image_raw - Original uncompressed
  • /image_raw/compressed - JPEG compressed
  • /image_raw/zstd - ZSTD compressed
  • /image_raw/theora - Video codec

No extra configuration needed - just subscribe to /compressed instead of /image_raw.

Traffic Control Quick Reference

# Add bandwidth limit with delay
tc qdisc add dev lo root netem delay 10ms 5ms rate 200mbit

# Remove all tc rules
tc qdisc del dev lo root

# Show current rules
tc qdisc show dev lo

Exercise 7: Congestion Handling

Head-of-Line Blocking

When a network is congested, messages queue up. If one stream blocks, all streams may be affected:

Figure 4: Head-of-line blocking: large camera frames block tiny control commands
┌─────────────────────────────────────────────────────────────────────────┐
│                    HEAD-OF-LINE BLOCKING                                │
│                                                                          │
│    Publishers              Queue              Network                   │
│                                                                          │
│    /camera ─────►  ┌─────────────┐                                      │
│    (27 MB/s)       │ ███████████ │───slow────►  WiFi                    │
│                    │ ███████████ │  (blocked)  (congested)              │
│    /cmd_vel ────►  │ ░░░░░░░░░░░ │                                      │
│    (tiny)          └─────────────┘                                      │
│                          ▲                                               │
│                          │                                               │
│    /cmd_vel is BLOCKED behind /camera data!                            │
│    Control commands delayed = dangerous for robot!                      │
└─────────────────────────────────────────────────────────────────────────┘

Priority Queues

Zenoh supports priority levels to ensure critical messages aren’t blocked:

Priority Use Case Example Topics
Real-time Emergency, safety /emergency_stop
Interactive Control commands /cmd_vel, /arm_control
Data Sensor streams /camera, /lidar
Background Logs, diagnostics /rosout, /diagnostics

Zenoh Priority Configuration

// zenoh_priority.json5
{
  // Define priority mappings
  qos: {
    publication: {
      // High priority for control
      "rt/cmd_vel": { priority: "real_time" },
      "rt/emergency_stop": { priority: "real_time" },

      // Normal priority for sensors
      "rt/camera/**": { priority: "data" },
      "rt/lidar/**": { priority: "data" },

      // Low priority for logs
      "rt/rosout": { priority: "background" }
    }
  }
}

Official Zenoh Priority Levels

📋 From ROSCon India 2025 Workshop (ZettaScale)

Zenoh supports these priority levels for controlling delivery and processing order.

Priority Constant Value Use Case
Z_PRIORITY_REAL_TIME 1 Emergency stop, safety-critical
Z_PRIORITY_INTERACTIVE_HIGH 2 Control commands (/cmd_vel)
Z_PRIORITY_INTERACTIVE_LOW 3 User interactions
Z_PRIORITY_DATA_HIGH 4 High-priority sensor data
Z_PRIORITY_DATA 5 Normal sensor streams
Z_PRIORITY_DATA_LOW 6 Low-priority telemetry
Z_PRIORITY_BACKGROUND 7 Logs, diagnostics

Per-Topic Priority in Zenoh Session Config:

// zenoh_session_priority.json5
{
  mode: "client",
  connect: { endpoints: ["tcp/127.0.0.1:7447"] },

  // Per-topic QoS configuration
  qos: {
    publication: {
      // Real-time priority for control commands
      "rt/cmd_vel": {
        priority: 2,      // Z_PRIORITY_INTERACTIVE_HIGH
        express: true     // Bypass batching for latency-critical!
      },
      "rt/emergency_stop": {
        priority: 1,      // Z_PRIORITY_REAL_TIME
        express: true
      },
      // Data priority for sensors
      "rt/camera/**": { priority: 5 },
      "rt/scan": { priority: 5 }
    }
  }
}
express Mode: Bypass Batching!

The express: true setting tells Zenoh to send this message immediately without waiting to batch it with other messages. This reduces latency for critical messages at the cost of slightly less efficient network usage.

Use express: true for: - /cmd_vel - Control commands - /emergency_stop - Safety critical - Heartbeat/liveness messages

Flow Control Strategies

Strategy Description When to Use
Drop oldest Discard old messages when queue full Camera streams
Drop newest Discard new messages when queue full Rarely useful
Block sender Pause publisher until queue has space Reliable data
Sample Only keep every Nth message High-rate sensors

Downsampling: Per-Topic Frequency Control

📋 From ROSCon India 2025 Workshop (ZettaScale)

Zenoh routers can automatically downsample topics when sending to bandwidth-constrained links.

When to Use Downsampling:

  • Crossing from wired LAN to wireless (WiFi, GSM, LTE)
  • Sending to cloud with limited bandwidth
  • Edge computing with limited processing capacity
  • Fleet monitoring where full rate isn’t needed

Zenoh Router Downsampling Configuration:

// router_downsample.json5
{
  mode: "router",
  listen: { endpoints: ["tcp/0.0.0.0:7447"] },

  // Downsampling rules for different interfaces
  downsampling: [
    {
      // For wireless interface (low bandwidth)
      interfaces: ["tcp/0.0.0.0:7448"],  // Wireless client port
      rules: [
        {
          key_exprs: ["rt/camera/**"],
          freq: 5  // 30 Hz → 5 Hz
        },
        {
          key_exprs: ["rt/scan"],
          freq: 2  // 10 Hz → 2 Hz
        },
        {
          key_exprs: ["rt/odom"],
          freq: 10  // Keep at 10 Hz (control needs it)
        }
      ]
    }
  ]
}

Comparison: Downsampling vs Compression vs Filtering:

Technique Bandwidth Reduction Quality Loss Use Case
Filtering (–allow/–deny) 100% (blocked) Complete Topics not needed remotely
Downsampling (freq) 50-90% Temporal Lower frequency acceptable
Compression (JPEG/ZSTD) 80-95% Minor Images, point clouds
All combined 99%+ Varies WiFi/cellular deployments
Downsampling Strategy
  1. Filter first - Don’t send topics you don’t need (--deny)
  2. Compress images - Use /compressed instead of /image_raw
  3. Downsample remaining - Reduce frequency for non-critical topics
  4. Keep control at full rate - Never downsample /cmd_vel!

Congestion Control: Per-Topic Dropping Policy

📋 From ROSCon India 2025 Workshop (ZettaScale)

System load is highest at startup - important messages can be dropped. Configure per-topic policies!

The Startup Problem:

At robot startup, multiple high-bandwidth topics start publishing simultaneously: - PointCloud (10-50 MB/message) - OccupancyGrid (1-10 MB/message) - Camera images (1-3 MB/message)

This burst can overwhelm the network, causing important messages to be dropped!

ROS 2 QoS + Zenoh Interaction:

ROS 2 QoS Zenoh Behavior Effect
RELIABLE + KEEP_ALL Forces reliable channel, always blocks Publisher waits for ACK
RELIABLE + KEEP_LAST(n) Reliable with limited queue Old messages dropped
BEST_EFFORT No retries, fast Messages may be lost

Per-Topic Congestion Policy Configuration:

// zenoh_congestion.json5
{
  mode: "client",
  connect: { endpoints: ["tcp/127.0.0.1:7447"] },

  qos: {
    publication: {
      // Camera: Best effort, drop oldest (it's temporal data)
      "rt/camera/**": {
        congestion_control: "drop",
        priority: 5
      },

      // PointCloud: Allow dropping to prevent blocking
      "rt/points": {
        congestion_control: "drop",
        priority: 5
      },

      // Control commands: Block until delivered (critical!)
      "rt/cmd_vel": {
        congestion_control: "block",
        priority: 2,
        express: true
      },

      // OccupancyGrid: Drop oldest, keep latest
      "rt/map": {
        congestion_control: "drop",
        priority: 6
      }
    }
  }
}

Congestion Control Options:

congestion_control Behavior Best For
"block" Publisher waits until space available Control commands, must-deliver
"drop" New messages dropped when queue full Sensor streams, temporal data
Critical: Never Block Camera Publishers!

If camera publishers use congestion_control: "block", a slow subscriber can freeze the entire camera pipeline. Always use "drop" for high-bandwidth sensor data.

Hands-On Testing: cmd_vel Under Congestion

We tested with camera (large images) and cmd_vel (tiny control commands) running simultaneously:

# Start cmd_vel publisher
ros2 topic pub /cmd_vel geometry_msgs/msg/Twist \
    "{linear: {x: 0.5}, angular: {z: 0.1}}" --rate 10

Baseline (No Congestion):

Topic Rate Jitter (std dev)
/cmd_vel 10 Hz 0.0003s
/camera/image_raw 30 FPS 0.0005s

Moderate Congestion (200 Mbps + 10ms delay):

Topic Rate Jitter (std dev)
/cmd_vel 10 Hz 0.0035s (10x worse)
/camera/image_raw ~5 FPS 0.04s

Severe Congestion (50 Mbps + 50ms delay):

Topic Rate Jitter (std dev) Interval Range
/cmd_vel 10 Hz 0.022s 62-135ms (expected: 100ms)
Figure 5: Small control messages slip through congestion but with increased jitter
Eureka Moment #5: Zenoh Handles Small Packets Well!

Surprising finding: cmd_vel (tiny messages) maintained 10 Hz rate even under severe congestion! Unlike classic “head-of-line blocking”, small messages slip through in Zenoh.

BUT timing consistency degrades significantly: - Jitter increased 73x (0.0003s → 0.022s) - Interval variance: ±37ms instead of ±1ms

Impact on Robots: - Jerky movement from inconsistent timing - Control loop instability - Path following errors

Lesson: Priority queues help ensure consistent timing, not just delivery. For safety-critical systems, predictable timing matters as much as throughput.


Exercise 8: NAT Traversal & Namespace Resolution

NAT Traversal Deep Dive

While Exercise 4 introduced cloud routers, Exercise 8 explores direct peer-to-peer connections through NAT.

┌─────────────────────────────────────────────────────────────────────────┐
│                    NAT TYPES                                             │
│                                                                          │
│    Full Cone NAT          Symmetric NAT          Port Restricted        │
│    ┌──────────┐           ┌──────────┐          ┌──────────┐           │
│    │ Any host │           │ Specific │          │ Specific │           │
│    │ can reach│           │host:port │          │ port     │           │
│    │ mapped   │           │ required │          │ required │           │
│    │ port     │           │          │          │          │           │
│    └──────────┘           └──────────┘          └──────────┘           │
│    Easy to traverse       Hard to traverse      Medium difficulty      │
└─────────────────────────────────────────────────────────────────────────┘

STUN/TURN Concepts

Protocol Purpose How It Works
STUN Discover public IP/port Client asks STUN server “what’s my public address?”
TURN Relay when direct fails All traffic goes through TURN server
ICE Try all methods STUN first, fall back to TURN
Figure 6: ICE connection establishment: STUN for discovery, TURN as fallback relay
┌─────────────────────────────────────────────────────────────────────────┐
│                    ICE CONNECTION ESTABLISHMENT                         │
│                                                                          │
│    1. Both peers query STUN server for public address                   │
│    2. Exchange addresses through signaling (Zenoh router)               │
│    3. Try direct connection (STUN-discovered addresses)                 │
│    4. If direct fails, use TURN relay                                   │
│                                                                          │
│    Robot ◄────── STUN ──────► Server ◄────── STUN ──────► Laptop       │
│       │                                                      │          │
│       └──────────────── Direct P2P (if possible) ───────────┘          │
│                              or                                         │
│       └─────────────── TURN Relay (fallback) ───────────────┘          │
└─────────────────────────────────────────────────────────────────────────┘

Namespace Resolution

When multiple robots connect to the same router, topic collisions occur:

┌─────────────────────────────────────────────────────────────────────────┐
│                    NAMESPACE COLLISION PROBLEM                          │
│                                                                          │
│    Robot 1 publishes: /odom                                             │
│    Robot 2 publishes: /odom        ──►  Which /odom is which?!         │
│    Robot 3 publishes: /odom                                             │
└─────────────────────────────────────────────────────────────────────────┘

Solution: Namespace Prefixes

# Robot 1: Prefix all topics with /robot1
ros2 run demo_nodes_cpp talker --ros-args --remap __ns:=/robot1

# Robot 2: Prefix all topics with /robot2
ros2 run demo_nodes_cpp talker --ros-args --remap __ns:=/robot2 --remap __node:=talker2

# Result:
# /robot1/chatter
# /robot2/chatter
Pro Tip: Use –ros-args Instead of Environment Variables

Using --ros-args --remap __ns:=/robotX is more reliable than the ROS_NAMESPACE environment variable, especially with rmw_zenoh_cpp.

Zenoh Namespace Mapping

// zenoh_namespace.json5
{
  // Map robot-specific namespace to Zenoh key space
  ros2dds: {
    namespace: "/go2_alpha",

    // Topic remapping
    pub: {
      "/odom": "fleet/go2_alpha/odom",
      "/cmd_vel": "fleet/go2_alpha/cmd_vel"
    }
  }
}

Multi-Robot Fleet Topology

┌─────────────────────────────────────────────────────────────────────────┐
│                    FLEET MANAGEMENT TOPOLOGY                            │
│                                                                          │
│                         ┌──────────────┐                                │
│                         │ Fleet Router │                                │
│                         │   (Cloud)    │                                │
│                         └──────┬───────┘                                │
│                                │                                         │
│         ┌──────────────────────┼──────────────────────┐                 │
│         │                      │                      │                 │
│    ┌────┴────┐            ┌────┴────┐            ┌────┴────┐           │
│    │ go2_001 │            │ go2_002 │            │ go2_003 │           │
│    │ /odom   │            │ /odom   │            │ /odom   │           │
│    └─────────┘            └─────────┘            └─────────┘           │
│                                                                          │
│    Zenoh Key Space:                                                     │
│    fleet/go2_001/odom                                                   │
│    fleet/go2_002/odom                                                   │
│    fleet/go2_003/odom                                                   │
└─────────────────────────────────────────────────────────────────────────┘

Hands-On Testing: Multi-Robot Namespaces

We tested namespace resolution with two simulated robots:

# Start two "robots" with different namespaces
ros2 run demo_nodes_cpp talker --ros-args --remap __ns:=/robot1 &
ros2 run demo_nodes_cpp talker --ros-args --remap __ns:=/robot2 --remap __node:=talker2 &

# Check nodes
ros2 node list
# Output:
# /robot1/talker
# /robot2/talker2

# Check topics
ros2 topic list | grep robot
# Output:
# /robot1/chatter
# /robot2/chatter

# Verify independent streams
ros2 topic echo /robot1/chatter --once
# data: 'Hello World: 27'

ros2 topic echo /robot2/chatter --once
# data: 'Hello World: 29'
Figure 7: Multiple robots with unique namespaces avoid topic collisions
Eureka Moment #6: Namespace Resolution Prevents Chaos!
Robot Node Name Namespace Topic
Robot 1 talker /robot1 /robot1/chatter
Robot 2 talker2 /robot2 /robot2/chatter

Key Findings: - Namespaces prevent topic collisions - essential for multi-robot fleets - Each robot gets isolated topics: /robot1/odom, /robot2/odom, etc. - Fleet monitoring: Subscribe to patterns like /robot*/status


Putting It All Together

The final workshop challenge combines all 8 exercises:

┌─────────────────────────────────────────────────────────────────────────┐
│                    COMPLETE WORKSHOP SCENARIO                           │
│                                                                          │
│    Exercise 1-3: Local robot running Zenoh + SHM                        │
│    ┌─────────────────────────────────────────────────────────┐          │
│    │  Robot (local)                                          │          │
│    │  ├─ rmw_zenoh_cpp (Ex 1)                               │          │
│    │  ├─ QoS configured (Ex 2)                              │          │
│    │  └─ SHM enabled for cameras (Ex 3)                     │          │
│    └─────────────────────────────────────────────────────────┘          │
│                          │                                               │
│                          │ WiFi (Ex 6 tuning)                           │
│                          │ Priority queues (Ex 7)                       │
│                          ▼                                               │
│    Exercise 4-5: Secure cloud connectivity                              │
│    ┌─────────────────────────────────────────────────────────┐          │
│    │  Cloud Router                                           │          │
│    │  ├─ TLS encrypted (Ex 5)                               │          │
│    │  ├─ mTLS authenticated (Ex 5)                          │          │
│    │  └─ NAT traversal (Ex 8)                               │          │
│    └─────────────────────────────────────────────────────────┘          │
│                          │                                               │
│                          ▼                                               │
│    Exercise 8: Multi-robot namespace                                    │
│    ┌─────────────────────────────────────────────────────────┐          │
│    │  Fleet Dashboard                                        │          │
│    │  ├─ /go2_001/odom                                      │          │
│    │  ├─ /go2_002/odom                                      │          │
│    │  └─ /go2_003/odom                                      │          │
│    └─────────────────────────────────────────────────────────┘          │
└─────────────────────────────────────────────────────────────────────────┘

Workshop Preparation Summary

What You’ll Have Learned

Exercise Key Skill
1 Zenoh pub/sub and discovery
2 QoS configuration and compatibility
3 Shared memory for zero-copy
4 Cloud router deployment
5 mTLS security setup
6 WiFi optimization and filtering
7 Priority queues for congestion
8 Multi-robot namespace management

Final Checklist

Before attending Workshop 3:


Resources


See Also