mirror of https://github.com/ekzhang/bore.git (synced 2026-04-18 14:40:21 +00:00)
Implement automatic client reconnection with exponential backoff and heartbeat timeout
- Add heartbeat timeout to client control connection using server heartbeats for dead connection detection
- Introduce exponential backoff with jitter for reconnection delays
- Add CLI flags: --no-reconnect to disable auto-reconnect, --max-reconnect-delay to configure backoff cap
- Classify authentication errors as fatal (never retried), all others retried automatically
- Configure TCP keepalive on control connections for OS-level dead connection detection
- Update documentation (README.md, CLAUDE.md) to describe reconnection behavior and new flags
- Add unit tests for backoff logic and error classification
parent 042fa78742
commit a13e03372e
9 changed files with 438 additions and 126 deletions
11
CLAUDE.md
@@ -23,10 +23,10 @@ The codebase is ~400 lines of async Rust using Tokio. No unsafe code (`#![forbid

 ### Modules

-- **`main.rs`** — CLI entry point using clap. Two subcommands: `local` (client) and `server`.
-- **`shared.rs`** — Protocol definitions. `ClientMessage`/`ServerMessage` enums serialized as JSON over TCP with null-byte delimiters. `Delimited<U>` wraps any async stream for framed JSON I/O. Key constants: `CONTROL_PORT = 7835`, `MAX_FRAME_LENGTH = 256`, `NETWORK_TIMEOUT = 3s`.
+- **`main.rs`** — CLI entry point using clap. Two subcommands: `local` (client) and `server`. The `local` subcommand includes a reconnection loop with exponential backoff (enabled by default, disable with `--no-reconnect`). Authentication errors are classified as fatal via `is_auth_error()` and never retried.
+- **`shared.rs`** — Protocol definitions. `ClientMessage`/`ServerMessage` enums serialized as JSON over TCP with null-byte delimiters. `Delimited<U>` wraps any async stream for framed JSON I/O. Key constants: `CONTROL_PORT = 7835`, `MAX_FRAME_LENGTH = 256`, `NETWORK_TIMEOUT = 3s`, `HEARTBEAT_TIMEOUT = 8s`. Also contains `ExponentialBackoff` for reconnection delays and `set_tcp_keepalive()` for OS-level dead connection detection.
 - **`auth.rs`** — Optional HMAC-SHA256 challenge-response authentication. Secret is SHA256-hashed before use. Constant-time comparison.
-- **`client.rs`** — `Client` connects to server's control port, sends `Hello(port)`, receives assigned port. For each incoming `Connection(uuid)`, opens a new TCP connection, sends `Accept(uuid)`, then bidirectionally proxies between local service and tunnel.
+- **`client.rs`** — `Client` connects to server's control port, sends `Hello(port)`, receives assigned port. The `listen()` method wraps `recv()` in a heartbeat timeout (8s) to detect dead connections, returning an error instead of blocking forever. TCP keepalive is set on the control connection. For each incoming `Connection(uuid)`, opens a new TCP connection, sends `Accept(uuid)`, then bidirectionally proxies between local service and tunnel.
 - **`server.rs`** — `Server` listens on control port. Allocates tunnel ports (random selection, 150 attempts). Stores pending connections in `DashMap<Uuid, TcpStream>` with 10-second expiry. Sends heartbeats every 500ms.
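
The null-byte framing described for `shared.rs` can be illustrated with a small synchronous sketch. This is not the project's `Delimited<U>` type (which wraps an async stream and decodes JSON); `read_frame` is a hypothetical helper that only splits a byte stream on `\0` and enforces the 256-byte limit, and it assumes the limit applies to the payload without the delimiter.

```rust
use std::io::{self, BufRead, Read};

/// Frame size limit in bytes, taken from `MAX_FRAME_LENGTH = 256` in the text.
/// Assumption: the limit covers the payload only, not the trailing null byte.
const MAX_FRAME_LENGTH: usize = 256;

/// Hypothetical helper: reads one null-terminated frame and returns its payload.
/// Returns `Ok(None)` when the stream ends cleanly between frames.
fn read_frame<R: BufRead>(reader: &mut R) -> io::Result<Option<String>> {
    let mut buf = Vec::new();
    // Cap the read at payload + delimiter so an oversized frame cannot
    // grow the buffer without bound.
    let mut limited = (&mut *reader).take(MAX_FRAME_LENGTH as u64 + 1);
    let n = limited.read_until(0, &mut buf)?;
    if n == 0 {
        return Ok(None);
    }
    if buf.last() != Some(&0) {
        // No delimiter seen: the frame is too long or the stream ended mid-frame.
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "frame too long or truncated",
        ));
    }
    buf.pop(); // drop the null delimiter
    String::from_utf8(buf)
        .map(Some)
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}
```

Each returned string would then be handed to a JSON decoder; that step is left out here to keep the sketch free of external crates.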

 ### Protocol Flow
@@ -36,6 +36,7 @@ The codebase is ~400 lines of async Rust using Tokio. No unsafe code (`#![forbid
 3. Client sends `Hello(desired_port)`, server responds with `Hello(actual_port)` and starts tunnel listener
 4. When external traffic hits the tunnel port, server stores the connection by UUID, sends `Connection(uuid)` to client
 5. Client opens a new connection to server, sends `Accept(uuid)`, server pairs streams, bidirectional copy begins
+6. If the control connection drops (heartbeat timeout or EOF), the client reconnects automatically with exponential backoff (unless `--no-reconnect` is set)
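
Step 6 distinguishes two ways the control connection can end: a heartbeat timeout and EOF. The following is a minimal synchronous sketch of that distinction, using a `std::sync::mpsc` channel as a stand-in for the control connection. The real client is async and uses Tokio; the `ListenEnd` enum, the `u64` connection id (in place of a UUID), and the `listen` signature here are all illustrative assumptions.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Simplified stand-in for the server-to-client messages named in the text.
#[derive(Debug, PartialEq)]
enum ServerMessage {
    Heartbeat,
    Connection(u64), // hypothetical: the real protocol carries a UUID
}

/// Why the listen loop stopped (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum ListenEnd {
    HeartbeatTimeout, // nothing arrived within the timeout: treat as dead
    Closed,           // the peer hung up (EOF)
}

/// Waits for messages, treating silence longer than `heartbeat_timeout`
/// as a dead connection instead of blocking forever.
fn listen(
    rx: &Receiver<ServerMessage>,
    heartbeat_timeout: Duration,
    mut on_connection: impl FnMut(u64),
) -> ListenEnd {
    loop {
        match rx.recv_timeout(heartbeat_timeout) {
            // Any message, including a heartbeat, restarts the timeout window.
            Ok(ServerMessage::Heartbeat) => continue,
            Ok(ServerMessage::Connection(id)) => on_connection(id),
            Err(RecvTimeoutError::Timeout) => return ListenEnd::HeartbeatTimeout,
            Err(RecvTimeoutError::Disconnected) => return ListenEnd::Closed,
        }
    }
}
```

With the values from the text (heartbeats every 500ms, an 8s timeout), many consecutive heartbeats would have to be missed before the loop gives up, so brief stalls should not trigger a reconnect. Either return value would then feed the reconnection loop.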

 ### Key Patterns
@@ -44,6 +45,10 @@ The codebase is ~400 lines of async Rust using Tokio. No unsafe code (`#![forbid
 - `Arc<Client>`/`Arc<Server>` shared across spawned Tokio tasks
 - `tokio::io::copy_bidirectional` for efficient TCP proxying
 - `anyhow::Result` with `.context()` for error propagation
+- Heartbeat timeout on client `listen()` loop to detect dead connections (8s timeout, server heartbeats every 500ms)
+- Exponential backoff with jitter for reconnection delays (1s base, configurable max)
+- TCP keepalive via `socket2` as defense-in-depth for dead connection detection
+- String-based error classification (`is_auth_error()`) to distinguish fatal from retriable errors
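
The backoff and error-classification patterns listed above might look roughly like the sketch below. The names `ExponentialBackoff` and `is_auth_error` come from the text, but the fields, method signatures, jitter range (half to full delay), and matched substrings are assumptions made for illustration, not the commit's actual code. The jitter value is passed in by the caller so the example needs no random-number crate.

```rust
use std::time::Duration;

/// Sketch of a backoff state machine: the delay doubles per attempt up to `max`.
struct ExponentialBackoff {
    base: Duration,
    max: Duration,
    attempt: u32,
}

impl ExponentialBackoff {
    fn new(base: Duration, max: Duration) -> Self {
        Self { base, max, attempt: 0 }
    }

    /// Returns the next delay. `jitter` is a caller-supplied random value in
    /// [0.0, 1.0]; the result lies between half and all of the capped delay
    /// (an assumed jitter scheme).
    fn next_delay(&mut self, jitter: f64) -> Duration {
        let exponent = self.attempt.min(31); // keep the shift in range
        let capped = self.base.saturating_mul(1u32 << exponent).min(self.max);
        self.attempt = self.attempt.saturating_add(1);
        capped.mul_f64(0.5 + 0.5 * jitter.clamp(0.0, 1.0))
    }

    /// Call after a successful connection so the next outage starts at `base`.
    fn reset(&mut self) {
        self.attempt = 0;
    }
}

/// String-based classification: returns true for errors that should never be
/// retried. The matched substrings are hypothetical.
fn is_auth_error(message: &str) -> bool {
    let lower = message.to_lowercase();
    lower.contains("authentication") || lower.contains("secret")
}
```

A reconnection loop would call `is_auth_error` on each failure, exit immediately on `true`, and otherwise sleep for `next_delay(..)` before trying again. Matching on message text is simple but brittle: rewording an error message can silently change whether it is retried.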

 ## Testing