
Performance Tuning Guide

Overview

This guide covers optimizing Torrent Gateway performance for different workloads and deployment sizes.

Database Optimization

Indexes

The migration script applies performance indexes automatically:

-- File lookup optimization
CREATE INDEX idx_files_owner_pubkey ON files(owner_pubkey);
CREATE INDEX idx_files_storage_type ON files(storage_type);
CREATE INDEX idx_files_access_level ON files(access_level);
CREATE INDEX idx_files_size ON files(size);
CREATE INDEX idx_files_last_access ON files(last_access);

-- Chunk optimization
CREATE INDEX idx_chunks_chunk_hash ON chunks(chunk_hash);

-- User statistics
CREATE INDEX idx_users_storage_used ON users(storage_used);

Database Maintenance

# Run regular maintenance
./scripts/migrate.sh

# Manual optimization
sqlite3 data/metadata.db "VACUUM;"
sqlite3 data/metadata.db "ANALYZE;"

Connection Pooling

Configure connection limits in your application:

// In production config
MaxOpenConns: 25
MaxIdleConns: 5
ConnMaxLifetime: 300 * time.Second

Application Tuning

Memory Management

Go Runtime Settings:

# Set garbage collection target
export GOGC=100

# Set memory limit
export GOMEMLIMIT=2GB

Container Limits:

services:
  gateway:
    deploy:
      resources:
        limits:
          memory: 2G
        reservations:
          memory: 1G

File Handling

Large File Optimization:

  • Files >10MB use torrent storage (chunked)
  • Files <10MB use blob storage (single file)
  • Chunk size: 256KB (configurable)
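The routing rule above can be sketched in Go. The constants mirror the documented defaults; treating exactly 10MB as the blob/torrent boundary (files strictly larger go to torrent storage) is an assumption, since the bullets only cover `>10MB` and `<10MB`:

```go
package main

// Thresholds from the routing rules above (assumed exact-boundary behavior).
const (
	blobThreshold = int64(10 << 20)  // 10 MiB: larger files are chunked
	chunkSize     = int64(256 << 10) // 256 KiB default chunk size
)

// useTorrentStorage reports whether a file of the given size is chunked.
func useTorrentStorage(size int64) bool {
	return size > blobThreshold
}

// chunkCount returns the number of chunks for a torrent-stored file,
// using ceiling division so a trailing partial chunk still counts.
func chunkCount(size int64) int64 {
	return (size + chunkSize - 1) / chunkSize
}
```

For example, a file just over the threshold (10 MiB) splits into 40 chunks of 256 KiB.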

Storage Path Optimization:

# Use SSD for database and small files
ln -s /fast/ssd/path data/blobs

# Use HDD for large file chunks
ln -s /bulk/hdd/path data/chunks

Network Performance

Connection Limits

Reverse Proxy (nginx):

upstream gateway {
    server 127.0.0.1:9876 max_fails=3 fail_timeout=30s;
    keepalive 32;
}

server {
    location / {
        proxy_pass http://gateway;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_buffering off;
    }
}

Rate Limiting

Configure rate limits based on usage patterns:

# In docker-compose.prod.yml
environment:
  - RATE_LIMIT_UPLOAD=10/minute
  - RATE_LIMIT_DOWNLOAD=100/minute
  - RATE_LIMIT_API=1000/minute

Storage Performance

Storage Backend Selection

Blob Storage (< 10MB files):

  • Best for: Documents, images, small media
  • Performance: Direct file system access
  • Scaling: Limited by file system performance

Torrent Storage (> 10MB files):

  • Best for: Large media, archives, datasets
  • Performance: Parallel chunk processing
  • Scaling: Horizontal scaling via chunk distribution

File System Tuning

For Linux ext4:

# Optimize for many small files
tune2fs -o journal_data_writeback /dev/sdb1
mount -o noatime,data=writeback /dev/sdb1 /data

For ZFS:

# Optimize for mixed workload
zfs set compression=lz4 tank/data
zfs set atime=off tank/data
zfs set recordsize=64K tank/data

Monitoring and Metrics

Key Metrics to Watch

Application Metrics:

  • Request rate and latency
  • Error rates by endpoint
  • Active connections
  • File upload/download rates
  • Storage usage growth

System Metrics:

  • CPU utilization
  • Memory usage
  • Disk I/O and space
  • Network throughput

Prometheus Queries

Request Rate:

rate(http_requests_total[5m])

95th Percentile Latency:

histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))

Error Rate:

rate(http_requests_total{status=~"5.."}[5m]) / rate(http_requests_total[5m])

Storage Growth:

increase(storage_bytes_total[24h])

Alert Thresholds

Critical Alerts:

  • Error rate > 5%
  • Response time > 5s
  • Disk usage > 90%
  • Memory usage > 85%

Warning Alerts:

  • Error rate > 1%
  • Response time > 2s
  • Disk usage > 80%
  • Memory usage > 70%

Load Testing

Running Load Tests

# Start with integration load test
go test -v -tags=integration ./test/... -run TestLoadTesting -timeout 15m

# Custom load test with specific parameters
go test -v -tags=integration ./test/... -run TestLoadTesting \
  -concurrent-users=100 \
  -test-duration=300s \
  -timeout 20m

Interpreting Results

Good Performance Indicators:

  • 95th percentile response time < 1s
  • Error rate < 0.1%
  • Throughput > 100 requests/second
  • Memory usage stable over time

Performance Bottlenecks:

  • High database response times → Add indexes or scale database
  • High CPU usage → Scale horizontally or optimize code
  • High memory usage → Check for memory leaks or add limits
  • High disk I/O → Use faster storage or optimize queries

Scaling Strategies

Vertical Scaling

Increase Resources:

services:
  gateway:
    deploy:
      resources:
        limits:
          cpus: '2.0'
          memory: 4G

Horizontal Scaling

Multiple Gateway Instances:

# Scale to 3 instances
docker-compose -f docker-compose.prod.yml up -d --scale gateway=3

Load Balancer Configuration:

upstream gateway_cluster {
    server 127.0.0.1:9876;
    server 127.0.0.1:9877;
    server 127.0.0.1:9878;
}

Database Scaling

Read Replicas:

  • Implement read-only database replicas
  • Route read queries to replicas
  • Use primary for writes only

Sharding Strategy:

  • Shard by user pubkey hash
  • Distribute across multiple databases
  • Implement shard-aware routing
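Shard-aware routing by pubkey hash can be sketched as below. FNV-1a is an illustrative choice of hash; any stable function works, but it must never change once data has been distributed, or lookups will miss their shard:

```go
package main

import "hash/fnv"

// shardFor maps a user pubkey to one of n database shards.
// The same pubkey always lands on the same shard.
func shardFor(pubkey string, n uint32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(pubkey)) // hash.Hash32 writes never return an error
	return h.Sum32() % n
}
```

Note that changing `n` remaps most keys; plan for consistent hashing or a resharding migration before growing the shard count.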

Caching Strategies

Application-Level Caching

Redis Configuration:

redis:
  image: redis:7-alpine
  command: redis-server --maxmemory 1gb --maxmemory-policy allkeys-lru

Cache Patterns:

  • User session data (TTL: 24h)
  • File metadata (TTL: 1h)
  • API responses (TTL: 5m)
  • Authentication challenges (TTL: 10m)

CDN Integration

For public files, consider CDN integration:

  • CloudFlare for global distribution
  • AWS CloudFront for AWS deployments
  • Custom edge servers for private deployments

Configuration Tuning

Environment Variables

Production Settings:

# Application tuning
export MAX_UPLOAD_SIZE=1GB
export CHUNK_SIZE=256KB
export MAX_CONCURRENT_UPLOADS=10
export DATABASE_TIMEOUT=30s

# Performance tuning
export GOMAXPROCS=4
export GOGC=100
export GOMEMLIMIT=2GB

# Logging
export LOG_LEVEL=info
export LOG_FORMAT=json

Docker Compose Optimization

services:
  gateway:
    # Use host networking for better performance
    network_mode: host
    
    # Optimize logging
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
    
    # Resource reservations
    deploy:
      resources:
        reservations:
          memory: 512M
          cpus: '0.5'

Benchmarking

Baseline Performance Tests

# API performance
ab -n 1000 -c 10 http://localhost:9876/api/health

# Upload performance
for i in {1..10}; do
  time curl -X POST -F "file=@test/testdata/small.txt" http://localhost:9876/api/upload
done

# Download performance  
time curl -O http://localhost:9876/api/download/[hash]

Continuous Performance Monitoring

Set up automated benchmarks:

# Add to cron
0 2 * * * /path/to/performance_benchmark.sh

Track performance metrics over time:

  • Response time trends
  • Throughput capacity
  • Resource utilization patterns
  • Error rate trends

Optimization Checklist

Application Level

  • Database indexes applied
  • Connection pooling configured
  • Caching strategy implemented
  • Resource limits set
  • Garbage collection tuned

Infrastructure Level

  • Fast storage for database
  • Adequate RAM allocated
  • Network bandwidth sufficient
  • Load balancer configured
  • CDN setup for static content

Monitoring Level

  • Performance alerts configured
  • Baseline metrics established
  • Regular load testing scheduled
  • Capacity planning reviewed
  • Performance dashboards created