torrent-gateway/TECHNICAL_OVERVIEW.md
enki b3204ea07a
first commit
2025-08-18 00:40:15 -07:00


# BitTorrent Gateway - Technical Overview
This document provides a comprehensive technical overview of the BitTorrent Gateway architecture, implementation details, and system design decisions.
## System Architecture
### High-Level Architecture
The BitTorrent Gateway is built as a unified system with multiple specialized components working together to provide intelligent content distribution:
```
┌─────────────────────────────────────────────────────────────┐
│                     BitTorrent Gateway                      │
├─────────────────────┬─────────────────────┬─────────────────┤
│ Gateway Server      │ Blossom Server      │ DHT Node        │
│ (Port 9877)         │ (Port 8082)         │ (Port 6883)     │
│                     │                     │                 │
│ • HTTP API          │ • Blob Storage      │ • Peer Discovery│
│ • WebSeed           │ • Nostr Protocol    │ • DHT Protocol  │
│ • Rate Limiting     │ • Content Address   │ • Bootstrap     │
│ • Abuse Prevention  │ • LRU Caching       │ • Announce      │
└─────────────────────┴─────────────────────┴─────────────────┘
                  ┌────────────┴────────────┐
                  │    Built-in Tracker     │
                  │                         │
                  │ • Announce/Scrape       │
                  │ • Peer Management       │
                  │ • Client Compatibility  │
                  │ • Statistics Tracking   │
                  └─────────────────────────┘
                  ┌────────────┴────────────┐
                  │     P2P Coordinator     │
                  │                         │
                  │ • Unified Peer Discovery│
                  │ • Smart Peer Ranking    │
                  │ • Load Balancing        │
                  │ • Health Monitoring     │
                  └─────────────────────────┘
```
### Core Components
#### 1. Gateway HTTP Server (internal/api/)
**Purpose**: Main API server and WebSeed implementation
**Port**: 9877
**Key Features**:
- RESTful API for file operations
- WebSeed (BEP-19) implementation for BitTorrent clients
- Smart proxy for reassembling chunked content
- Advanced LRU caching system
- Rate limiting and abuse prevention
**Implementation Details**:
- Built with Gorilla Mux router
- Comprehensive middleware stack (security, rate limiting, CORS)
- WebSeed with concurrent piece loading and caching
- Client-specific optimizations (qBittorrent, Transmission, etc.)
#### 2. Blossom Server (internal/blossom/)
**Purpose**: Content-addressed blob storage
**Port**: 8082
**Key Features**:
- Nostr-compatible blob storage protocol
- SHA-256 content addressing
- Direct storage for files <100MB
- Rate limiting and authentication
**Implementation Details**:
- Implements Blossom protocol specification
- Integration with gateway storage backend
- Efficient blob retrieval and caching
- Nostr event signing and verification
#### 3. DHT Node (internal/dht/)
**Purpose**: Distributed peer discovery
**Port**: 6883 (UDP)
**Key Features**:
- Full Kademlia DHT implementation
- Bootstrap connectivity to major DHT networks
- Automatic torrent announcement
- Peer discovery and sharing
**Implementation Details**:
- Custom DHT implementation with routing table management
- Integration with BitTorrent mainline DHT
- Bootstrap nodes are well-known public DHT routers
- Periodic maintenance and peer cleanup
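
Kademlia routing rests on an XOR distance between 160-bit node IDs. The following is a minimal sketch of that metric, not the gateway's actual routing code; the names `NodeID`, `xorDistance`, and `closer` are illustrative assumptions.

```go
package main

import (
	"bytes"
	"fmt"
)

// NodeID is a 160-bit identifier, the size used by the BitTorrent mainline DHT.
type NodeID [20]byte

// xorDistance returns the Kademlia distance between two IDs: their bitwise XOR.
func xorDistance(a, b NodeID) NodeID {
	var d NodeID
	for i := range a {
		d[i] = a[i] ^ b[i]
	}
	return d
}

// closer reports whether x is strictly closer to target than y.
// Distances are compared as big-endian unsigned integers.
func closer(target, x, y NodeID) bool {
	dx, dy := xorDistance(target, x), xorDistance(target, y)
	return bytes.Compare(dx[:], dy[:]) < 0
}

func main() {
	target := NodeID{0x10}
	near := NodeID{0x11} // differs from target only in a low bit of the first byte
	far := NodeID{0x90}  // differs from target in the highest bit
	fmt.Println("near is closer:", closer(target, near, far))
}
```

A routing table would typically use this ordering to decide which known nodes to query next for a given info hash.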
#### 4. Built-in BitTorrent Tracker (internal/tracker/)
**Purpose**: BitTorrent announce/scrape server
**Key Features**:
- Full BitTorrent tracker protocol
- Peer management and statistics
- Client compatibility optimizations
- Abuse detection and prevention
**Implementation Details**:
- Standards-compliant announce/scrape handling
- Support for both compact and dictionary peer formats
- Client detection and protocol adjustments
- Geographic proximity-based peer selection
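
The compact peer format mentioned above (BEP 23) packs each IPv4 peer into 6 bytes: 4 bytes of address followed by a 2-byte big-endian port. A minimal sketch of the encoding, assuming a hypothetical `Peer` type and `compactPeers` helper that are not taken from the gateway's source:

```go
package main

import (
	"fmt"
	"net"
)

// Peer is an illustrative type, not the tracker's actual peer struct.
type Peer struct {
	IP   net.IP
	Port uint16
}

// compactPeers encodes IPv4 peers as 6 bytes each: 4-byte address, then
// 2-byte port in network (big-endian) byte order.
func compactPeers(peers []Peer) ([]byte, error) {
	out := make([]byte, 0, 6*len(peers))
	for _, p := range peers {
		ip4 := p.IP.To4()
		if ip4 == nil {
			return nil, fmt.Errorf("peer %v is not an IPv4 address", p.IP)
		}
		out = append(out, ip4...)
		out = append(out, byte(p.Port>>8), byte(p.Port))
	}
	return out, nil
}

func main() {
	peers := []Peer{
		{IP: net.ParseIP("192.168.1.10"), Port: 6881},
		{IP: net.ParseIP("10.0.0.2"), Port: 51413},
	}
	b, err := compactPeers(peers)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%d bytes: % x\n", len(b), b)
}
```

The dictionary format instead returns a bencoded list of dictionaries with explicit `ip` and `port` keys, which is larger but self-describing.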
#### 5. P2P Coordinator (internal/p2p/)
**Purpose**: Unified management of all P2P components
**Key Features**:
- Aggregates peers from tracker, DHT, and WebSeed
- Smart peer ranking algorithm
- Load balancing across peer sources
- Health monitoring and alerting
**Implementation Details**:
- Weighted peer scoring across proximity, source reliability, history, load, and freshness
- Geographic proximity calculation
- Performance-based peer ranking
- Automatic failover and redundancy
## Storage Architecture
### Intelligent Storage Strategy
The system uses a dual-strategy approach based on file size:
```
File Upload → Size Analysis → Storage Decision
                              ┌───────┴───────┐
                              │               │
                           < 100MB         ≥ 100MB
                              │               │
                      ┌───────▼───────┐  ┌────▼─────┐
                      │ Blob Storage  │  │ Chunked  │
                      │               │  │ Storage  │
                      │ • Direct blob │  │          │
                      │ • Immediate   │  │ • 2MB    │
                      │   access      │  │   chunks │
                      │ • No P2P      │  │ • Torrent│
                      │   overhead    │  │   + DHT  │
                      └───────────────┘  └──────────┘
```
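
The size-based decision in the diagram can be sketched as a single comparison. This is an illustration only: `chooseStorage` and the constant names are hypothetical, and the threshold is assumed to mean 100 × 1024 × 1024 bytes.

```go
package main

import "fmt"

// StorageType mirrors the two storage_type values: 'blob' and 'chunked'.
type StorageType string

const (
	StorageBlob    StorageType = "blob"
	StorageChunked StorageType = "chunked"

	// blobThreshold is the 100MB cutoff (assumed to be binary megabytes).
	blobThreshold int64 = 100 * 1024 * 1024
)

// chooseStorage is a hypothetical helper: files below the threshold are stored
// as a single blob; files at or above it are chunked.
func chooseStorage(size int64) StorageType {
	if size < blobThreshold {
		return StorageBlob
	}
	return StorageChunked
}

func main() {
	for _, size := range []int64{5 << 20, 100 << 20, 2 << 30} {
		fmt.Printf("%d bytes -> %s\n", size, chooseStorage(size))
	}
}
```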
### Storage Backends
#### Metadata Database (SQLite)
```sql
-- File metadata
CREATE TABLE files (
    hash TEXT PRIMARY KEY,
    filename TEXT,
    size INTEGER,
    storage_type TEXT, -- 'blob' or 'chunked'
    created_at DATETIME,
    user_id TEXT
);

-- Torrent information
CREATE TABLE torrents (
    info_hash TEXT PRIMARY KEY,
    file_hash TEXT,
    piece_length INTEGER,
    pieces_count INTEGER,
    magnet_link TEXT,
    FOREIGN KEY(file_hash) REFERENCES files(hash)
);

-- Chunk mapping for large files
CREATE TABLE chunks (
    file_hash TEXT,
    chunk_index INTEGER,
    chunk_hash TEXT,
    chunk_size INTEGER,
    PRIMARY KEY(file_hash, chunk_index)
);
```
#### Blob Storage
- Direct file storage in `./data/blobs/`
- SHA-256 content addressing
- Efficient for small files and frequently accessed content
- No P2P overhead - immediate availability
#### Chunk Storage
- Large files split into 2MB pieces in `./data/chunks/`
- BitTorrent-compatible piece structure
- Enables parallel downloads and partial file access
- Each chunk independently content-addressed
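
To make the chunking step concrete, here is a minimal sketch that splits a stream into fixed-size chunks and hashes each one with SHA-256. The `Chunk` type, `splitChunks`, and `splitBytes` are assumptions for illustration, with fields named after the `chunks` table columns. BitTorrent v1 piece hashes use SHA-1 and would be computed separately when the torrent is built.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
)

const chunkSize = 2 * 1024 * 1024 // 2MB, as described above

// Chunk is an illustrative record mirroring the chunks table columns.
type Chunk struct {
	Index int
	Hash  string // hex-encoded SHA-256 of the chunk bytes
	Size  int
}

// splitChunks reads r in pieces of at most size bytes and hashes each piece.
// The final chunk may be shorter than size.
func splitChunks(r io.Reader, size int) ([]Chunk, error) {
	var chunks []Chunk
	buf := make([]byte, size)
	for i := 0; ; i++ {
		n, err := io.ReadFull(r, buf)
		if n > 0 {
			sum := sha256.Sum256(buf[:n])
			chunks = append(chunks, Chunk{Index: i, Hash: hex.EncodeToString(sum[:]), Size: n})
		}
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			return chunks, nil // input exhausted
		}
		if err != nil {
			return nil, err
		}
	}
}

// splitBytes is a convenience wrapper for in-memory data.
func splitBytes(data []byte, size int) ([]Chunk, error) {
	return splitChunks(bytes.NewReader(data), size)
}

func main() {
	data := bytes.Repeat([]byte("x"), 5*1024*1024) // 5MB of sample data
	chunks, err := splitBytes(data, chunkSize)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, c := range chunks {
		fmt.Printf("chunk %d: %d bytes, sha256 %s...\n", c.Index, c.Size, c.Hash[:12])
	}
}
```

Reading through an `io.Reader` keeps memory use at one chunk regardless of file size, which matters for the files at or above 100MB that take this path.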
### Caching System
#### LRU Piece Cache
```go
type PieceCache struct {
	cache       map[string]*CacheEntry
	lru         *list.List
	mutex       sync.RWMutex
	maxSize     int64
	currentSize int64
}

type CacheEntry struct {
	Key        string
	Data       []byte
	Size       int64
	AccessTime time.Time
	Element    *list.Element
}
```
**Features**:
- Configurable cache size limits
- Least Recently Used eviction
- Concurrent access with read-write locks
- Cache hit ratio tracking and optimization
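
How lookup and eviction fit together can be sketched with `container/list`. This is a simplified stand-in for `PieceCache`, not its actual implementation; `lruCache`, `entry`, `Get`, and `Put` are illustrative names.

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
)

type entry struct {
	key  string
	data []byte
}

// lruCache pairs a map for O(1) lookup with a list ordered from most to
// least recently used.
type lruCache struct {
	mu          sync.Mutex
	items       map[string]*list.Element
	order       *list.List
	maxSize     int64
	currentSize int64
}

func newLRUCache(maxSize int64) *lruCache {
	return &lruCache{items: make(map[string]*list.Element), order: list.New(), maxSize: maxSize}
}

// Get returns the cached bytes and marks the entry as most recently used.
func (c *lruCache) Get(key string) ([]byte, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	el, ok := c.items[key]
	if !ok {
		return nil, false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).data, true
}

// Put inserts or replaces an entry, then evicts from the back of the list
// until the total size fits within maxSize.
func (c *lruCache) Put(key string, data []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[key]; ok {
		old := el.Value.(*entry)
		c.currentSize += int64(len(data)) - int64(len(old.data))
		old.data = data
		c.order.MoveToFront(el)
	} else {
		c.items[key] = c.order.PushFront(&entry{key: key, data: data})
		c.currentSize += int64(len(data))
	}
	for c.currentSize > c.maxSize && c.order.Len() > 0 {
		back := c.order.Back()
		e := back.Value.(*entry)
		c.order.Remove(back)
		delete(c.items, e.key)
		c.currentSize -= int64(len(e.data))
	}
}

func main() {
	c := newLRUCache(4) // room for two 2-byte pieces
	c.Put("piece-0", []byte("aa"))
	c.Put("piece-1", []byte("bb"))
	c.Get("piece-0")               // piece-0 becomes most recently used
	c.Put("piece-2", []byte("cc")) // evicts piece-1, the least recently used
	_, ok := c.Get("piece-1")
	fmt.Println("piece-1 still cached:", ok)
}
```

One design point: `Get` reorders the list, so it mutates shared state. The sketch therefore uses a plain mutex; with a read-write lock, a read lock alone would not be enough for lookups that update recency.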
## P2P Integration & Coordination
### Unified Peer Discovery
The P2P coordinator aggregates peers from multiple sources:
1. **BitTorrent Tracker**: Authoritative peer list from announces
2. **DHT Network**: Distributed peer discovery across the network
3. **WebSeed**: Gateway itself as a reliable seed source
### Smart Peer Ranking Algorithm
```go
func (pr *PeerRanker) RankPeers(peers []PeerInfo, clientLocation *Location) []RankedPeer {
	var ranked []RankedPeer
	for _, peer := range peers {
		score := pr.calculatePeerScore(peer, clientLocation)
		ranked = append(ranked, RankedPeer{
			Peer:   peer,
			Score:  score,
			Reason: pr.getScoreReason(peer, clientLocation),
		})
	}
	// Sort by score (highest first)
	sort.Slice(ranked, func(i, j int) bool {
		return ranked[i].Score > ranked[j].Score
	})
	return ranked
}
```
**Scoring Factors**:
- **Geographic Proximity** (30%): Distance-based scoring
- **Source Reliability** (25%): Tracker > DHT > WebSeed fallback
- **Historical Performance** (20%): Past connection success rates
- **Load Balancing** (15%): Distribute load across available peers
- **Freshness** (10%): Recently seen peers preferred
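
The weights above can be combined into a single score as a weighted sum. The sketch below assumes each factor has already been normalized to the range 0 to 1; how the gateway derives those inputs is not shown here, and `Factors` and `peerScore` are hypothetical names.

```go
package main

import "fmt"

// Factors holds per-peer inputs, each assumed to be normalized to 0..1.
type Factors struct {
	Proximity   float64
	Reliability float64
	Performance float64
	LoadBalance float64
	Freshness   float64
}

// Weights from the scoring factors listed above; they sum to 1.0.
const (
	wProximity   = 0.30
	wReliability = 0.25
	wPerformance = 0.20
	wLoadBalance = 0.15
	wFreshness   = 0.10
)

// peerScore returns a weighted score on a 0-100 scale.
func peerScore(f Factors) float64 {
	return 100 * (wProximity*f.Proximity +
		wReliability*f.Reliability +
		wPerformance*f.Performance +
		wLoadBalance*f.LoadBalance +
		wFreshness*f.Freshness)
}

func main() {
	nearby := Factors{Proximity: 0.9, Reliability: 1.0, Performance: 0.8, LoadBalance: 0.5, Freshness: 1.0}
	distant := Factors{Proximity: 0.2, Reliability: 0.7, Performance: 0.9, LoadBalance: 0.9, Freshness: 0.6}
	fmt.Printf("nearby: %.1f, distant: %.1f\n", peerScore(nearby), peerScore(distant))
}
```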
### Health Monitoring System
#### Component Health Scoring
```go
type HealthStatus struct {
	IsHealthy    bool                   `json:"is_healthy"`
	Score        int                    `json:"score"` // 0-100
	Issues       []string               `json:"issues"`
	LastChecked  time.Time              `json:"last_checked"`
	ResponseTime int64                  `json:"response_time"` // milliseconds
	Details      map[string]interface{} `json:"details"`
}
```
**Weighted Health Calculation**:
- WebSeed: 40% (most critical for availability)
- Tracker: 35% (important for peer discovery)
- DHT: 25% (supplemental peer source)
#### Automatic Alerting
- Health scores below configurable threshold trigger alerts
- Multiple alert mechanisms (logs, callbacks, future integrations)
- Component-specific and overall system health monitoring
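
As a worked example of the weighted calculation and the alert threshold, the sketch below combines three component scores (each 0-100) into an overall score. `ComponentScores`, `overallHealth`, and `needsAlert` are illustrative names, and rounding to the nearest integer is an assumption.

```go
package main

import "fmt"

// ComponentScores holds one 0-100 health score per component.
type ComponentScores struct {
	WebSeed int
	Tracker int
	DHT     int
}

// overallHealth applies the 40/35/25 weighting and rounds to the nearest integer.
func overallHealth(s ComponentScores) int {
	return (40*s.WebSeed + 35*s.Tracker + 25*s.DHT + 50) / 100
}

// needsAlert reports whether a score falls below the configured threshold.
func needsAlert(score, threshold int) bool {
	return score < threshold
}

func main() {
	s := ComponentScores{WebSeed: 100, Tracker: 80, DHT: 40}
	overall := overallHealth(s)
	// (40*100 + 35*80 + 25*40) / 100 = 7800 / 100 = 78
	fmt.Println("overall:", overall, "alert at threshold 80:", needsAlert(overall, 80))
}
```

With these weights, a fully failed DHT (score 0) still leaves the overall score at 75 when the other two components are healthy, which reflects its role as a supplemental peer source.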
## WebSeed Implementation (BEP-19)
### Standards Compliance
The WebSeed implementation follows BEP-19 specification:
- **URL-based seeding**: BitTorrent clients can fetch pieces via HTTP
- **Range request support**: Efficient partial file downloads
- **Piece boundary alignment**: Proper handling of piece boundaries
- **Error handling**: Appropriate HTTP status codes for BitTorrent clients
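
Range support and piece boundary handling come down to mapping a byte range onto piece indices. The sketch below handles only the single-range form `bytes=START-END`; `parseByteRange` and `piecesForRange` are hypothetical helpers, not the gateway's handler code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseByteRange parses "bytes=START-END" and clamps END to the file size.
// Open-ended and multi-range headers are out of scope for this sketch.
func parseByteRange(header string, fileSize int64) (start, end int64, err error) {
	if !strings.HasPrefix(header, "bytes=") {
		return 0, 0, fmt.Errorf("unsupported range unit in %q", header)
	}
	first, last, ok := strings.Cut(strings.TrimPrefix(header, "bytes="), "-")
	if !ok {
		return 0, 0, fmt.Errorf("malformed range %q", header)
	}
	if start, err = strconv.ParseInt(first, 10, 64); err != nil {
		return 0, 0, fmt.Errorf("bad range start: %w", err)
	}
	if end, err = strconv.ParseInt(last, 10, 64); err != nil {
		return 0, 0, fmt.Errorf("bad range end: %w", err)
	}
	if start < 0 || start > end || start >= fileSize {
		return 0, 0, fmt.Errorf("range %d-%d not satisfiable for size %d", start, end, fileSize)
	}
	if end >= fileSize {
		end = fileSize - 1
	}
	return start, end, nil
}

// piecesForRange returns the first and last piece index that overlap the
// inclusive byte range [start, end].
func piecesForRange(start, end, pieceLength int64) (first, last int64) {
	return start / pieceLength, end / pieceLength
}

func main() {
	const pieceLength = 2 * 1024 * 1024
	start, end, err := parseByteRange("bytes=2097151-2097152", 10*1024*1024)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	first, last := piecesForRange(start, end, pieceLength)
	fmt.Printf("bytes %d-%d span pieces %d through %d\n", start, end, first, last)
}
```

In an HTTP handler, a successful parse would normally lead to a `206 Partial Content` response and an unsatisfiable range to `416 Range Not Satisfiable`.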
### Advanced Features
#### Concurrent Request Optimization
```go
type ConcurrentRequestTracker struct {
	activeRequests map[string]*RequestInfo
	mutex          sync.RWMutex
	maxConcurrent  int
}
```
- Deduplicates requests: concurrent requests for the same piece wait on a single load instead of each loading it
- Caps the number of concurrent requests (`maxConcurrent`)
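
Request deduplication of this kind is often built as a map of in-flight calls, where later callers wait on a channel that the first caller closes. The sketch below illustrates that pattern under assumed names (`dedup`, `call`, `Do`, `waiting`); it does not reproduce `ConcurrentRequestTracker` and leaves out the concurrency cap.

```go
package main

import (
	"fmt"
	"sync"
)

// call tracks one in-flight load and the callers waiting on it.
type call struct {
	done    chan struct{}
	data    []byte
	err     error
	waiters int
}

// dedup shares one load among concurrent callers asking for the same key.
type dedup struct {
	mu       sync.Mutex
	inflight map[string]*call
}

func newDedup() *dedup { return &dedup{inflight: make(map[string]*call)} }

// Do runs load at most once per key at a time. Callers that arrive while a
// load is in flight block until it finishes and then share its result.
func (d *dedup) Do(key string, load func() ([]byte, error)) ([]byte, error) {
	d.mu.Lock()
	if c, ok := d.inflight[key]; ok {
		c.waiters++
		d.mu.Unlock()
		<-c.done
		return c.data, c.err
	}
	c := &call{done: make(chan struct{})}
	d.inflight[key] = c
	d.mu.Unlock()

	c.data, c.err = load()

	d.mu.Lock()
	delete(d.inflight, key)
	d.mu.Unlock()
	close(c.done) // closing the channel publishes data and err to all waiters
	return c.data, c.err
}

// waiting reports how many callers are currently blocked on key.
func (d *dedup) waiting(key string) int {
	d.mu.Lock()
	defer d.mu.Unlock()
	if c, ok := d.inflight[key]; ok {
		return c.waiters
	}
	return 0
}

func main() {
	d := newDedup()
	var mu sync.Mutex
	loads := 0
	var wg sync.WaitGroup
	for i := 0; i < 5; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			d.Do("piece-7", func() ([]byte, error) {
				mu.Lock()
				loads++
				mu.Unlock()
				return []byte("data"), nil
			})
		}()
	}
	wg.Wait()
	fmt.Println("loads:", loads) // between 1 and 5, depending on how the calls overlap
}
```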
#### Client-Specific Optimizations
```go
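func (h *Handler) detectClient(userAgent string) ClientType {
	// Client User-Agent strings vary in case (for example "qBittorrent/..."),
	// so normalize before matching.
	ua := strings.ToLower(userAgent)
	switch {
	case strings.Contains(ua, "qbittorrent"):
		return ClientQBittorrent
	case strings.Contains(ua, "transmission"):
		return ClientTransmission
	case strings.Contains(ua, "webtorrent"):
		return ClientWebTorrent
	// ... additional client detection
	default:
		// ClientUnknown is an assumed fallback constant; without a default
		// return the function would not compile.
		return ClientUnknown
	}
}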
```
**Per-Client Optimizations**:
- **qBittorrent**: Standard intervals, no special handling needed
- **Transmission**: Prefers shorter announce intervals (≤30 min)
- **WebTorrent**: Short intervals for web compatibility (≤5 min)
- **uTorrent**: Minimum interval enforcement to prevent spam
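
These per-client rules amount to clamping a base announce interval. A minimal sketch follows; the 30-minute and 5-minute caps come from the list above, while the uTorrent minimum of 15 minutes, the `ClientUTorrent` and `ClientUnknown` constants, and the `announceInterval` helper are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// ClientType is redeclared here so the sketch is self-contained.
type ClientType int

const (
	ClientUnknown ClientType = iota
	ClientQBittorrent
	ClientTransmission
	ClientWebTorrent
	ClientUTorrent
)

const (
	transmissionMax = 30 * time.Minute // cap from the list above
	webTorrentMax   = 5 * time.Minute  // cap from the list above
	uTorrentMin     = 15 * time.Minute // assumed value; no number is given above
)

// announceInterval adjusts the tracker's base interval for a detected client.
func announceInterval(c ClientType, base time.Duration) time.Duration {
	switch c {
	case ClientTransmission:
		if base > transmissionMax {
			return transmissionMax
		}
	case ClientWebTorrent:
		if base > webTorrentMax {
			return webTorrentMax
		}
	case ClientUTorrent:
		if base < uTorrentMin {
			return uTorrentMin
		}
	}
	return base // qBittorrent and unknown clients keep the standard interval
}

func main() {
	base := time.Hour
	for _, c := range []ClientType{ClientQBittorrent, ClientTransmission, ClientWebTorrent, ClientUTorrent} {
		fmt.Printf("client %d: %v\n", c, announceInterval(c, base))
	}
}
```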
## Nostr Integration
### Content Announcements
When files are uploaded, they're announced to configured Nostr relays:
```go
func (g *Gateway) announceToNostr(fileInfo *FileInfo, torrentInfo *TorrentInfo) error {
	event := nostr.Event{
		Kind:      1063, // NIP-94 file metadata kind, used here for torrent announcements
		Content:   fmt.Sprintf("New torrent: %s", fileInfo.Filename),
		CreatedAt: time.Now(),
		Tags: []nostr.Tag{
			{"magnet", torrentInfo.MagnetLink},
			{"size", fmt.Sprintf("%d", fileInfo.Size)},
			{"name", fileInfo.Filename},
			{"webseed", g.getWebSeedURL(fileInfo.Hash)},
		},
	}
	return g.nostrClient.PublishEvent(event)
}
```
### Decentralized Discovery
- Content announced to multiple Nostr relays for redundancy
- Other nodes can discover content via Nostr event subscriptions
- Enables fully decentralized content network
- No central authority or single point of failure
## Performance Optimizations
### Concurrent Processing
#### Parallel Piece Loading
```go
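func (ws *WebSeedHandler) loadPieces(pieces []PieceRequest) error {
	const maxConcurrency = 10
	// Buffered channel used as a counting semaphore.
	semaphore := make(chan struct{}, maxConcurrency)
	var wg sync.WaitGroup
	for _, piece := range pieces {
		// Acquire before spawning so that at most maxConcurrency
		// goroutines exist at any time.
		semaphore <- struct{}{}
		wg.Add(1)
		go func(p PieceRequest) {
			defer wg.Done()
			defer func() { <-semaphore }() // Release
			ws.loadSinglePiece(p)
		}(piece)
	}
	wg.Wait()
	// Per-piece results are not collected here, so this always returns nil.
	return nil
}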
```
#### Connection Pooling
- HTTP client connection reuse
- Database connection pooling
- BitTorrent connection management
- Resource cleanup and lifecycle management
## Monitoring & Observability
### Comprehensive Statistics
#### System Statistics
```go
type SystemStats struct {
	Files struct {
		Total     int64 `json:"total"`
		BlobFiles int64 `json:"blob_files"`
		Torrents  int64 `json:"torrents"`
		TotalSize int64 `json:"total_size"`
	} `json:"files"`
	P2P struct {
		TrackerPeers   int `json:"tracker_peers"`
		DHTNodes       int `json:"dht_nodes"`
		ActiveTorrents int `json:"active_torrents"`
	} `json:"p2p"`
	Performance struct {
		CacheHitRatio   float64 `json:"cache_hit_ratio"`
		AvgResponseTime int64   `json:"avg_response_time"`
		RequestsPerSec  float64 `json:"requests_per_sec"`
	} `json:"performance"`
}
```
### Diagnostic Endpoints
- `/api/stats` - Overall system statistics
- `/api/p2p/stats` - Detailed P2P statistics
- `/api/health` - Component health status
- `/api/diagnostics` - Comprehensive system diagnostics
- `/api/webseed/health` - WebSeed-specific health
## Conclusion
The BitTorrent Gateway combines conventional HTTP hosting with peer-to-peer distribution: files under 100MB are served directly as blobs, while larger files are split into 2MB chunks and shared through the built-in tracker, the DHT node, and WebSeed. The P2P coordinator ranks peers from all three sources, the LRU piece cache reduces repeated piece loads, and the statistics and health endpoints expose the state of each component.
Following existing standards (BEP-19 WebSeed, the BitTorrent tracker protocol, Blossom, and Nostr) keeps the gateway interoperable with common BitTorrent clients and Nostr relays, and announcing content over multiple relays avoids reliance on a single point of discovery.