torrent-gateway/TECHNICAL_OVERVIEW.md
enki 76979d055b
Transcoding and Nip71 update
2025-08-21 19:32:26 -07:00

BitTorrent Gateway - Technical Overview

This document describes the architecture, implementation details, and system design decisions of the BitTorrent Gateway.

System Architecture

High-Level Architecture

The BitTorrent Gateway runs as a single system made up of several specialized components: an HTTP gateway server, a Blossom blob server, and a DHT node, backed by a built-in tracker and a P2P coordinator:

┌─────────────────────────────────────────────────────────────┐
│                    BitTorrent Gateway                       │
├─────────────────────┬─────────────────────┬─────────────────┤
│   Gateway Server    │   Blossom Server    │    DHT Node     │
│   (Port 9877)       │   (Port 8082)       │   (Port 6883)   │
│                     │                     │                 │
│ • HTTP API          │ • Blob Storage      │ • Peer Discovery│
│ • WebSeed           │ • Nostr Protocol    │ • DHT Protocol  │
│ • Rate Limiting     │ • Content Address   │ • Bootstrap     │
│ • Abuse Prevention  │ • LRU Caching       │ • Announce      │
│ • Video Transcoding │                     │                 │
└─────────────────────┴─────────────────────┴─────────────────┘
                                │
                   ┌────────────┴────────────┐
                   │   Built-in Tracker      │
                   │                         │
                   │ • Announce/Scrape       │
                   │ • Peer Management       │
                   │ • Client Compatibility  │
                   │ • Statistics Tracking   │
                   └─────────────────────────┘
                                │
                   ┌────────────┴────────────┐
                   │   P2P Coordinator       │
                   │                         │
                   │ • Unified Peer Discovery│
                   │ • Smart Peer Ranking    │
                   │ • Load Balancing        │
                   │ • Health Monitoring     │
                   └─────────────────────────┘

Core Components

1. Gateway HTTP Server (internal/api/)

Purpose: Main API server and WebSeed implementation

Port: 9877

Key Features:

  • RESTful API for file operations
  • WebSeed (BEP-19) implementation for BitTorrent clients
  • Smart proxy for reassembling chunked content
  • Advanced LRU caching system
  • Rate limiting and abuse prevention
  • Integrated video transcoding engine

Implementation Details:

  • Built with Gorilla Mux router
  • Comprehensive middleware stack (security, rate limiting, CORS)
  • WebSeed with concurrent piece loading and caching
  • Client-specific optimizations (qBittorrent, Transmission, etc.)

2. Blossom Server (internal/blossom/)

Purpose: Content-addressed blob storage

Port: 8082

Key Features:

  • Nostr-compatible blob storage protocol
  • SHA-256 content addressing
  • Direct storage for files <100MB
  • Rate limiting and authentication

Implementation Details:

  • Implements Blossom protocol specification
  • Integration with gateway storage backend
  • Efficient blob retrieval and caching
  • Nostr event signing and verification
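SHA-256 content addressing can be illustrated with a minimal sketch. The helper name `blobAddress` is hypothetical and not taken from the gateway's code; the sketch only shows that a blob's address is the hex-encoded SHA-256 digest of its bytes.

```go
package main

import (
    "crypto/sha256"
    "encoding/hex"
)

// blobAddress returns the lowercase hex SHA-256 digest of data,
// which serves as the blob's content address. (Hypothetical helper.)
func blobAddress(data []byte) string {
    sum := sha256.Sum256(data)
    return hex.EncodeToString(sum[:])
}
```

Because the address is derived from the content, identical uploads map to the same address, so they can be deduplicated. A production implementation would likely stream the upload through a hash rather than hold it in memory.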

3. DHT Node (internal/dht/)

Purpose: Distributed peer discovery

Port: 6883 (UDP)

Key Features:

  • Full Kademlia DHT implementation
  • Bootstrap connectivity to major DHT networks
  • Automatic torrent announcement
  • Peer discovery and sharing

Implementation Details:

  • Custom DHT implementation with routing table management
  • Integration with BitTorrent mainline DHT
  • Bootstrap nodes include major public DHT routers
  • Periodic maintenance and peer cleanup
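Kademlia routing is built on an XOR distance metric between node IDs. The sketch below shows that metric in isolation; the `NodeID` type and both function names are assumptions for illustration and may not match the code in internal/dht/.

```go
package main

import "bytes"

// NodeID is a 160-bit identifier, the size used by the BitTorrent mainline DHT.
type NodeID [20]byte

// xorDistance returns the Kademlia distance between two IDs: their bitwise XOR.
func xorDistance(a, b NodeID) NodeID {
    var d NodeID
    for i := range a {
        d[i] = a[i] ^ b[i]
    }
    return d
}

// closer reports whether a is closer to target than b is.
// Distances are compared as big-endian unsigned integers.
func closer(a, b, target NodeID) bool {
    da, db := xorDistance(a, target), xorDistance(b, target)
    return bytes.Compare(da[:], db[:]) < 0
}
```

A routing table would use a comparison like `closer` to choose which known nodes to query next for a given info hash.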

4. Built-in BitTorrent Tracker (internal/tracker/)

Purpose: BitTorrent announce/scrape server

Key Features:

  • Full BitTorrent tracker protocol
  • Peer management and statistics
  • Client compatibility optimizations
  • Abuse detection and prevention

Implementation Details:

  • Standards-compliant announce/scrape handling
  • Support for both compact and dictionary peer formats
  • Client detection and protocol adjustments
  • Geographic proximity-based peer selection
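As an illustration of the compact peer format mentioned above, here is a minimal sketch of the IPv4 encoding defined in BEP 23: 4 address bytes followed by a 2-byte big-endian port for each peer. The `Peer` struct and `compactPeers` name are hypothetical.

```go
package main

import "net"

// Peer is a hypothetical minimal peer record for this sketch.
type Peer struct {
    IP   net.IP
    Port uint16
}

// compactPeers encodes IPv4 peers in the compact format:
// 4 bytes of address followed by a 2-byte big-endian port per peer.
// Peers without an IPv4 address are skipped.
func compactPeers(peers []Peer) []byte {
    out := make([]byte, 0, 6*len(peers))
    for _, p := range peers {
        ip4 := p.IP.To4()
        if ip4 == nil {
            continue
        }
        out = append(out, ip4...)
        out = append(out, byte(p.Port>>8), byte(p.Port))
    }
    return out
}
```

The dictionary format carries the same information as a bencoded list of per-peer dictionaries, which is larger on the wire but still expected by some older clients.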

5. P2P Coordinator (internal/p2p/)

Purpose: Unified management of all P2P components

Key Features:

  • Aggregates peers from tracker, DHT, and WebSeed
  • Smart peer ranking algorithm
  • Load balancing across peer sources
  • Health monitoring and alerting

Implementation Details:

  • Weighted peer scoring across proximity, source reliability, performance, load, and freshness
  • Geographic proximity calculation
  • Performance-based peer ranking
  • Automatic failover and redundancy

6. Video Transcoding Engine (internal/transcoding/)

Purpose: Automatic video conversion for web compatibility

Key Features:

  • H.264/AAC MP4 conversion using FFmpeg
  • Background processing with priority queuing
  • Smart serving (transcoded when ready, original as fallback)
  • Progress tracking and status API endpoints
  • Configurable quality profiles and resource limits

Implementation Details:

  • Queue-based job processing with worker pools
  • Database tracking of transcoding status and progress
  • File reconstruction for chunked torrents
  • Intelligent priority system based on file size
  • Error handling and retry mechanisms

Storage Architecture

Intelligent Storage Strategy

The system uses a dual-strategy approach based on file size:

File Upload → Size Analysis → Storage Decision → Video Processing
                    │                              │
            ┌───────┴───────┐                     │
            │               │                     │
       < 100MB         ≥ 100MB                   │
            │               │                     │
    ┌───────▼───────┐  ┌────▼────┐               │
    │ Blob Storage  │  │ Chunked │               │
    │               │  │ Storage │               │
    │ • Direct blob │  │         │               │
    │ • Immediate   │  │ • 2MB   │               │
    │   access      │  │   chunks│               │
    │ • No P2P      │  │ • Torrent│               │
    │   overhead    │  │   + DHT  │               │
    └───────────────┘  └─────────┘               │
                           │                     │
                    ┌──────┴─────────────────────▼──┐
                    │        Video Analysis         │
                    │                               │
                    │ • Format Detection           │
                    │ • Transcoding Queue          │
                    │ • Priority Assignment        │
                    │ • Background Processing      │
                    └───────────────────────────────┘
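The size-based decision in the diagram above can be sketched as a single function. The constant and function names are hypothetical, and the sketch assumes "100MB" means 100 MiB; the returned strings match the `storage_type` values used in the metadata schema.

```go
package main

// blobThreshold is the size at which storage switches from blob to chunked.
// Assumption: the documented "100MB" is interpreted as 100 MiB.
const blobThreshold int64 = 100 << 20

// storageTypeFor returns "blob" for files under the threshold
// and "chunked" for files at or above it.
func storageTypeFor(size int64) string {
    if size < blobThreshold {
        return "blob"
    }
    return "chunked"
}
```

Note that the boundary is inclusive on the chunked side: a file of exactly 100MB is chunked, matching the "≥ 100MB" branch in the diagram.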

Storage Backends

Metadata Database (SQLite)

-- File metadata
CREATE TABLE files (
    hash TEXT PRIMARY KEY,
    filename TEXT,
    size INTEGER,
    storage_type TEXT, -- 'blob' or 'chunked'
    created_at DATETIME,
    user_id TEXT
);

-- Torrent information
CREATE TABLE torrents (
    info_hash TEXT PRIMARY KEY,
    file_hash TEXT,
    piece_length INTEGER,
    pieces_count INTEGER,
    magnet_link TEXT,
    FOREIGN KEY(file_hash) REFERENCES files(hash)
);

-- Chunk mapping for large files
CREATE TABLE chunks (
    file_hash TEXT,
    chunk_index INTEGER,
    chunk_hash TEXT,
    chunk_size INTEGER,
    PRIMARY KEY(file_hash, chunk_index)
);

-- Transcoding job tracking
CREATE TABLE transcoding_status (
    file_hash TEXT PRIMARY KEY,
    status TEXT NOT NULL,
    error_message TEXT,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

Blob Storage

  • Direct file storage in ./data/blobs/
  • SHA-256 content addressing
  • Efficient for small files and frequently accessed content
  • No P2P overhead - immediate availability

Chunk Storage

  • Large files split into 2MB pieces in ./data/chunks/
  • BitTorrent-compatible piece structure
  • Enables parallel downloads and partial file access
  • Each chunk independently content-addressed

Transcoded Storage

  • Processed video files stored in ./data/transcoded/
  • Organized by original file hash subdirectories
  • H.264/AAC MP4 format for universal web compatibility
  • Smart serving prioritizes transcoded versions when available

Caching System

LRU Piece Cache

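import (
    "container/list"
    "sync"
    "time"
)

// PieceCache is a size-bounded LRU cache of torrent pieces.
type PieceCache struct {
    cache       map[string]*CacheEntry // key -> entry lookup
    lru         *list.List             // recency order used for eviction
    mutex       sync.RWMutex           // guards the fields above and below
    maxSize     int64                  // configured size limit
    currentSize int64                  // total size currently cached
}

// CacheEntry is a single cached piece.
type CacheEntry struct {
    Key        string
    Data       []byte
    Size       int64
    AccessTime time.Time
    Element    *list.Element // this entry's position in the LRU list
}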

Features:

  • Configurable cache size limits
  • Least Recently Used eviction
  • Concurrent access with read-write locks
  • Cache hit ratio tracking and optimization
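The eviction behaviour can be sketched with a simplified cache. This is not the gateway's `PieceCache`; the `lruCache` type and its methods are hypothetical and omit access times and hit-ratio tracking. It uses a plain `sync.Mutex` because `Get` also reorders the list, so a lookup is not a pure read.

```go
package main

import (
    "container/list"
    "sync"
)

// entry is the value stored in each list element.
type entry struct {
    key  string
    data []byte
}

// lruCache is a simplified size-bounded LRU cache (hypothetical sketch).
type lruCache struct {
    mu      sync.Mutex
    items   map[string]*list.Element
    order   *list.List // front = most recently used
    maxSize int64
    size    int64
}

func newLRUCache(maxSize int64) *lruCache {
    return &lruCache{
        items:   make(map[string]*list.Element),
        order:   list.New(),
        maxSize: maxSize,
    }
}

// Get returns the cached bytes and marks the entry as most recently used.
func (c *lruCache) Get(key string) ([]byte, bool) {
    c.mu.Lock()
    defer c.mu.Unlock()
    el, ok := c.items[key]
    if !ok {
        return nil, false
    }
    c.order.MoveToFront(el)
    return el.Value.(*entry).data, true
}

// Put stores data under key, then evicts least recently used
// entries from the back of the list until the cache fits maxSize.
func (c *lruCache) Put(key string, data []byte) {
    c.mu.Lock()
    defer c.mu.Unlock()
    if el, ok := c.items[key]; ok {
        e := el.Value.(*entry)
        c.size -= int64(len(e.data))
        e.data = data
        c.order.MoveToFront(el)
    } else {
        c.items[key] = c.order.PushFront(&entry{key: key, data: data})
    }
    c.size += int64(len(data))
    for c.size > c.maxSize && c.order.Len() > 0 {
        e := c.order.Remove(c.order.Back()).(*entry)
        delete(c.items, e.key)
        c.size -= int64(len(e.data))
    }
}
```

One consequence of this design is that an item larger than `maxSize` is evicted immediately after insertion, so oversized pieces are never cached.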

Video Transcoding System

Architecture Overview

The transcoding system provides automatic video conversion for web compatibility:

type TranscodingEngine struct {
    // Core Components
    Transcoder    *Transcoder     // FFmpeg integration
    Manager       *Manager        // Job coordination
    WorkerPool    chan Job        // Background processing
    Database      *sql.DB         // Status tracking
    
    // Configuration
    ConcurrentJobs int            // Parallel workers
    WorkDirectory  string         // Processing workspace
    QualityProfiles []Quality     // Output formats
}

Processing Pipeline

  1. Upload Detection: Video files automatically identified during upload
  2. Queue Decision: Files ≥50MB queued for transcoding with priority based on size
  3. File Reconstruction: Chunked torrents reassembled into temporary files
  4. FFmpeg Processing: H.264/AAC conversion with web optimization flags
  5. Smart Serving: Transcoded versions served when ready, originals as fallback
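The FFmpeg step in the pipeline above could look roughly like the following argument builder. The exact flags the gateway passes are not listed in this document, so this set is an assumption: `libx264` and `aac` select the H.264 and AAC encoders, and `-movflags +faststart` is a common web optimization that places the MP4 index at the start of the file. The function name `ffmpegArgs` is hypothetical.

```go
package main

// ffmpegArgs builds an illustrative argument list for an H.264/AAC MP4 transcode.
// The flag selection is an assumption, not the gateway's actual configuration.
func ffmpegArgs(input, output string) []string {
    return []string{
        "-y", // overwrite the output file if it already exists
        "-i", input,
        "-c:v", "libx264", // H.264 video
        "-c:a", "aac", // AAC audio
        "-movflags", "+faststart", // let playback begin before the download completes
        output,
    }
}
```

The result would typically be passed to `exec.Command("ffmpeg", ffmpegArgs(in, out)...)` from the `os/exec` package.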

Transcoding Manager

func (tm *Manager) QueueVideoForTranscoding(fileHash, fileName, filePath string, fileSize int64) {
    // Check if already processed
    if tm.HasTranscodedVersion(fileHash) {
        return
    }
    
    // Analyze file format
    needsTranscoding, err := tm.transcoder.NeedsTranscoding(filePath)
    if err != nil {
        // Analysis failed: log and skip queuing so the original keeps being served (assumed handling)
        log.Printf("transcoding analysis failed for %s: %v", fileHash, err)
        return
    }
    if !needsTranscoding {
        tm.markAsWebCompatible(fileHash)
        return
    }
    
    // Create prioritized job
    job := Job{
        ID:        fmt.Sprintf("transcode_%s", fileHash),
        InputPath: filePath,
        OutputDir: filepath.Join(tm.transcoder.workDir, fileHash),
        Priority:  tm.calculatePriority(fileSize),
        Callback:  tm.jobCompletionHandler,
    }
    
    tm.transcoder.SubmitJob(job)
    tm.markTranscodingQueued(fileHash)
}

Smart Priority System

  • High Priority (8): Files < 500MB for faster user feedback
  • Medium Priority (5): Files between 500MB and 5GB, handled in the standard processing queue
  • Low Priority (2): Files > 5GB to prevent resource monopolization
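The three tiers above map directly onto a size switch. This is a sketch of what `calculatePriority` might do, written as a free function named `priorityFor` (a hypothetical name); it assumes binary units and treats the boundary sizes of exactly 500MB and exactly 5GB as medium priority.

```go
package main

const (
    mb = int64(1) << 20
    gb = int64(1) << 30
)

// priorityFor maps a file size in bytes to a transcoding priority.
// Higher values are processed first.
func priorityFor(fileSize int64) int {
    switch {
    case fileSize < 500*mb:
        return 8 // high: small files give faster user feedback
    case fileSize > 5*gb:
        return 2 // low: very large files should not monopolize workers
    default:
        return 5 // medium: standard queue
    }
}
```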

Status API Integration

Users can track transcoding progress via authenticated endpoints:

  • /api/users/me/files/{hash}/transcoding-status - Real-time status and progress
  • Response includes job status, progress percentage, and transcoded file availability

P2P Integration & Coordination

Unified Peer Discovery

The P2P coordinator aggregates peers from multiple sources:

  1. BitTorrent Tracker: Authoritative peer list from announces
  2. DHT Network: Distributed peer discovery across the network
  3. WebSeed: Gateway itself as a reliable seed source

Smart Peer Ranking Algorithm

func (pr *PeerRanker) RankPeers(peers []PeerInfo, clientLocation *Location) []RankedPeer {
    var ranked []RankedPeer
    
    for _, peer := range peers {
        score := pr.calculatePeerScore(peer, clientLocation)
        ranked = append(ranked, RankedPeer{
            Peer: peer,
            Score: score,
            Reason: pr.getScoreReason(peer, clientLocation),
        })
    }
    
    // Sort by score (highest first)
    sort.Slice(ranked, func(i, j int) bool {
        return ranked[i].Score > ranked[j].Score
    })
    
    return ranked
}

Scoring Factors:

  • Geographic Proximity (30%): Distance-based scoring
  • Source Reliability (25%): Tracker > DHT > WebSeed fallback
  • Historical Performance (20%): Past connection success rates
  • Load Balancing (15%): Distribute load across available peers
  • Freshness (10%): Recently seen peers preferred
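One way to combine these factors is a weighted sum, sketched below. It assumes each factor has already been normalized to the range 0 to 1; the `peerFactors` struct and `peerScore` function are hypothetical and stand in for whatever `calculatePeerScore` does internally.

```go
package main

// peerFactors holds per-factor scores, each normalized to [0, 1]. (Hypothetical type.)
type peerFactors struct {
    Proximity, Reliability, Performance, LoadBalance, Freshness float64
}

// peerScore combines the factors using the documented weights,
// producing a score in [0, 1].
func peerScore(f peerFactors) float64 {
    return 0.30*f.Proximity +
        0.25*f.Reliability +
        0.20*f.Performance +
        0.15*f.LoadBalance +
        0.10*f.Freshness
}
```

Since the weights sum to 100%, a peer that scores perfectly on every factor receives a score of 1, and proximity alone can contribute at most 0.30.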

Health Monitoring System

Component Health Scoring

type HealthStatus struct {
    IsHealthy     bool      `json:"is_healthy"`
    Score         int       `json:"score"`          // 0-100
    Issues        []string  `json:"issues"`
    LastChecked   time.Time `json:"last_checked"`
    ResponseTime  int64     `json:"response_time"`  // milliseconds
    Details       map[string]interface{} `json:"details"`
}

Weighted Health Calculation:

  • WebSeed: 40% (most critical for availability)
  • Tracker: 35% (important for peer discovery)
  • DHT: 25% (supplemental peer source)
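The weighted calculation can be sketched with integer arithmetic, assuming each component reports a score from 0 to 100 as in the `Score` field of `HealthStatus`. The function name `overallHealth` is hypothetical.

```go
package main

// overallHealth combines component scores (each 0-100) using the
// 40/35/25 weights and rounds to the nearest integer.
func overallHealth(webSeed, tracker, dht int) int {
    return (40*webSeed + 35*tracker + 25*dht + 50) / 100
}
```

For example, a healthy WebSeed and tracker with a completely failed DHT still yields an overall score of 75, which reflects the DHT's role as a supplemental peer source.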

Automatic Alerting

  • Health scores below configurable threshold trigger alerts
  • Multiple alert mechanisms (logs, callbacks, future integrations)
  • Component-specific and overall system health monitoring

WebSeed Implementation (BEP-19)

Standards Compliance

The WebSeed implementation follows BEP-19 specification:

  • URL-based seeding: BitTorrent clients can fetch pieces via HTTP
  • Range request support: Efficient partial file downloads
  • Piece boundary alignment: Proper handling of piece boundaries
  • Error handling: Appropriate HTTP status codes for BitTorrent clients
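Serving a range request from chunked storage requires mapping the requested bytes onto piece indexes. The sketch below shows that mapping for an inclusive byte range, which is how HTTP Range headers express ranges; the function name `pieceSpan` is hypothetical.

```go
package main

// pieceSpan returns the indexes of the first and last pieces that overlap
// the inclusive byte range [start, end].
// It assumes 0 <= start <= end and pieceLength > 0.
func pieceSpan(start, end, pieceLength int64) (first, last int64) {
    return start / pieceLength, end / pieceLength
}
```

A handler would load pieces `first` through `last`, then trim `start % pieceLength` bytes from the front of the first piece and cut the last piece at `end % pieceLength`.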

Advanced Features

Concurrent Request Optimization

type ConcurrentRequestTracker struct {
    activeRequests map[string]*RequestInfo
    mutex          sync.RWMutex
    maxConcurrent  int
}

  • Prevents duplicate piece loads
  • Manages concurrent request limits
  • Request deduplication and waiting

Client-Specific Optimizations

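func (h *Handler) detectClient(userAgent string) ClientType {
    // User-Agent strings vary in case (e.g. "qBittorrent/..."), so match on a lowercased copy
    ua := strings.ToLower(userAgent)
    switch {
    case strings.Contains(ua, "qbittorrent"):
        return ClientQBittorrent
    case strings.Contains(ua, "transmission"):
        return ClientTransmission
    case strings.Contains(ua, "webtorrent"):
        return ClientWebTorrent
    // ... additional client detection
    default:
        // ClientUnknown is an assumed fallback constant for unrecognized clients
        return ClientUnknown
    }
}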

Per-Client Optimizations:

  • qBittorrent: Standard intervals, no special handling needed
  • Transmission: Prefers shorter announce intervals (≤30 min)
  • WebTorrent: Short intervals for web compatibility (≤5 min)
  • uTorrent: Minimum interval enforcement to prevent spam
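The interval caps listed above could be applied as follows. This sketch works in seconds, the unit used for `interval` in tracker announce responses, and redeclares a small `ClientType` enum so that it is self-contained; `announceIntervalSeconds` is a hypothetical name. The uTorrent minimum is omitted because this document does not state its value.

```go
package main

// ClientType identifies a detected BitTorrent client (redeclared for this sketch).
type ClientType int

const (
    ClientUnknown ClientType = iota
    ClientQBittorrent
    ClientTransmission
    ClientWebTorrent
)

// announceIntervalSeconds caps the tracker's default announce interval
// according to the per-client limits: 30 minutes for Transmission and
// 5 minutes for WebTorrent. Other clients keep the default.
func announceIntervalSeconds(c ClientType, defaultSeconds int) int {
    limit := defaultSeconds
    switch c {
    case ClientTransmission:
        limit = 30 * 60
    case ClientWebTorrent:
        limit = 5 * 60
    }
    if defaultSeconds > limit {
        return limit
    }
    return defaultSeconds
}
```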

Nostr Integration

Content Announcements

When files are uploaded, they're announced to configured Nostr relays:

func (g *Gateway) announceToNostr(fileInfo *FileInfo, torrentInfo *TorrentInfo) error {
    event := nostr.Event{
        Kind:      2003, // NIP-35 torrent announcement kind
        Content:   fmt.Sprintf("New torrent: %s", fileInfo.Filename),
        CreatedAt: time.Now(),
        Tags: []nostr.Tag{
            {"magnet", torrentInfo.MagnetLink},
            {"size", fmt.Sprintf("%d", fileInfo.Size)},
            {"name", fileInfo.Filename},
            {"webseed", g.getWebSeedURL(fileInfo.Hash)},
        },
    }
    
    return g.nostrClient.PublishEvent(event)
}

Decentralized Discovery

  • Content announced to multiple Nostr relays for redundancy
  • Other nodes can discover content via Nostr event subscriptions
  • Enables fully decentralized content network
  • No central authority or single point of failure

Performance Optimizations

Concurrent Processing

Parallel Piece Loading

func (ws *WebSeedHandler) loadPieces(pieces []PieceRequest) error {
    const maxConcurrency = 10
    semaphore := make(chan struct{}, maxConcurrency)
    var wg sync.WaitGroup
    
    for _, piece := range pieces {
        // Acquire before spawning so at most maxConcurrency goroutines exist at once
        semaphore <- struct{}{}
        wg.Add(1)
        go func(p PieceRequest) {
            defer wg.Done()
            defer func() { <-semaphore }() // Release
            
            ws.loadSinglePiece(p)
        }(piece)
    }
    
    wg.Wait()
    return nil // per-piece errors are not surfaced in this excerpt
}

Connection Pooling

  • HTTP client connection reuse
  • Database connection pooling
  • BitTorrent connection management
  • Resource cleanup and lifecycle management

Monitoring & Observability

Comprehensive Statistics

System Statistics

type SystemStats struct {
    Files struct {
        Total     int64 `json:"total"`
        BlobFiles int64 `json:"blob_files"`
        Torrents  int64 `json:"torrents"`
        TotalSize int64 `json:"total_size"`
    } `json:"files"`
    
    P2P struct {
        TrackerPeers   int `json:"tracker_peers"`
        DHTNodes       int `json:"dht_nodes"`
        ActiveTorrents int `json:"active_torrents"`
    } `json:"p2p"`
    
    Performance struct {
        CacheHitRatio   float64 `json:"cache_hit_ratio"`
        AvgResponseTime int64   `json:"avg_response_time"`
        RequestsPerSec  float64 `json:"requests_per_sec"`
    } `json:"performance"`
}

Diagnostic Endpoints

  • /api/stats - Overall system statistics
  • /api/p2p/stats - Detailed P2P statistics
  • /api/health - Component health status
  • /api/diagnostics - Comprehensive system diagnostics
  • /api/webseed/health - WebSeed-specific health
  • /api/users/me/files/{hash}/transcoding-status - Video transcoding progress

Conclusion

The BitTorrent Gateway combines HTTP hosting, BitTorrent peer-to-peer distribution, and automatic video transcoding in a single modular system. Files under 100MB are served directly from blob storage, larger files are chunked and distributed through the tracker, DHT, and WebSeed, and videos are transcoded in the background for web playback. The design is intended to suit both small deployments and large-scale content distribution networks.

Standards compliance (BEP-19 WebSeed, the Blossom protocol, Nostr announcements), unified peer discovery, LRU piece caching, and component health monitoring are meant to keep the system reliable while preserving the decentralized principles of the BitTorrent protocol.