dhtcrawler

dhtcrawler is a DHT crawler written in Erlang. It can join a DHT network and crawl many P2P torrents. The program saves all torrent info into a database and provides an HTTP interface to search for torrents by keyword.

Usage

  • Download MongoDB and start it with text search enabled, e.g.:

      mongod --dbpath db --setParameter textSearchEnabled=true
    
  • Download the dhtcrawler source code

  • Use rebar to download and install all dependencies

      rebar get-deps
    
  • Compile

      rebar compile
    
  • Start dhtcrawler from an Erlang shell

      crawler_app:start().
    
  • Start the HTTP front end

      crawler_http:start().
    
  • Open a web browser and go to localhost:8000/index.html
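
The steps above could be collected into one script, roughly as sketched below. This is only a sketch under assumptions that are not confirmed by this README: mongod, rebar and erl are on PATH, the script is run from the dhtcrawler source root, and rebar places compiled code in ebin/ and deps/*/ebin. The `step` and `start_all` helpers and the `DRY_RUN` variable are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Sketch of the usage steps as one script.
# Assumptions (not from the README): mongod, rebar and erl are on PATH,
# the script runs from the dhtcrawler source root, and rebar put the
# compiled code in ebin/ and deps/*/ebin.

# With DRY_RUN=1 (the default) each step is printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, else run it.
step() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: run every usage step in order, stop on failure.
start_all() {
    step mkdir -p db || return 1
    # Run MongoDB in the background; --fork needs a log destination.
    step mongod --dbpath db --setParameter textSearchEnabled=true \
        --fork --logpath mongod.log || return 1
    step rebar get-deps || return 1
    step rebar compile || return 1
    # Start the crawler and the HTTP front end in one Erlang shell.
    step erl -pa ebin -pa deps/*/ebin \
        -eval 'crawler_app:start(), crawler_http:start().'
}

start_all
```

Run it as is to see the commands, then set DRY_RUN=0 once the paths match your setup. The Erlang shell stays open in the foreground, so the browser step still has to be done separately.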

Config

See priv/dhtcrawler.config.

NOTE: if you change the node_count value in dhtcrawler.config, you should delete all files saved in the dhtstate directory.
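
Clearing the saved state can be scripted; the following is a minimal sketch. It assumes the dhtstate directory sits directly under the directory you pass in (the current directory by default), which this README does not state, and `reset_dhtstate` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: empty the dhtstate directory after node_count changes.
# Assumes dhtstate lives directly under the given root (default: ".");
# check where your installation keeps it before running this.
reset_dhtstate() {
    root="${1:-.}"
    dir="$root/dhtstate"
    if [ -d "$dir" ]; then
        # Remove the directory with everything in it, then recreate it empty.
        rm -rf -- "$dir" && mkdir -- "$dir"
        echo "cleared $dir"
    else
        echo "no dhtstate directory under $root"
    fi
}
```

Stop dhtcrawler first, call `reset_dhtstate` from the directory that contains dhtstate, then start it again.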