Kevin Lynx
|
810464330d
|
NOTE: big change! Need to delete config files. The crawler will cache hashes and merge duplicated queries.
|
2013-07-17 22:55:35 +08:00 |
|
Kevin Lynx
|
629e92115d
|
fix cache_indexer download bug
|
2013-07-17 19:11:01 +08:00 |
|
Kevin Lynx
|
ff338f2c9b
|
fix cache_indexer state not saved correctly
|
2013-07-16 22:49:08 +08:00 |
|
Kevin Lynx
|
c85e216951
|
fix cache_indexer
|
2013-07-16 22:24:55 +08:00 |
|
Kevin Lynx
|
1ed66b3863
|
fix memory leak for hash_reader (message queue keeps increasing), set http search result to 50
|
2013-07-16 21:44:16 +08:00 |
|
Kevin Lynx
|
ff85af0806
|
try to fix high cpu usage when no hash and no wait_download
|
2013-07-15 23:01:26 +08:00 |
|
Kevin Lynx
|
c5db7ae966
|
restore `top' cache
|
2013-07-15 22:14:09 +08:00 |
|
Kevin Lynx
|
31a1bd04c0
|
prevent the hash reader from stopping when there's no hash and no wait_download
|
2013-07-15 22:04:41 +08:00 |
|
Kevin Lynx
|
d81d6a2fd2
|
integrate cache_index to hash_reader, default is disabled
|
2013-07-15 21:27:01 +08:00 |
|
Kevin Lynx
|
0f24428faa
|
add cache_indexer progress displaying
|
2013-07-15 13:39:41 +08:00 |
|
Kevin Lynx
|
5153568dc9
|
add cache_indexer, not integrated now, see src/cache_indexer/readme.md
|
2013-07-14 22:59:47 +08:00 |
|
Kevin Lynx
|
0579304407
|
change hash_reader to read hash/wait_download using findAndModify, to avoid separate read and delete operations
|
2013-07-14 15:33:46 +08:00 |
|
Kevin Lynx
|
552dcb9983
|
fix name_seger
|
2013-07-14 13:53:03 +08:00 |
|
Kevin Lynx
|
8d71c043bb
|
fix name_seger tool
|
2013-07-14 11:44:32 +08:00 |
|
Kevin Lynx
|
40bdebc5b4
|
change name_seger tool to multi-processes
|
2013-07-14 11:17:15 +08:00 |
|
Kevin Lynx
|
86665cb93b
|
only build torrent name indexes
|
2013-07-14 10:00:38 +08:00 |
|
Kevin Lynx
|
a1fc6ec3c0
|
add text segment config for hash_reader (text_seg), the default is simple
|
2013-07-13 22:27:17 +08:00 |
|
Kevin Lynx
|
59b54380c8
|
minor fix on name_seger
|
2013-07-13 12:05:41 +08:00 |
|
Kevin Lynx
|
81e184c396
|
remove debug info
|
2013-07-13 11:51:35 +08:00 |
|
Kevin Lynx
|
269584c708
|
add rmmseg to segment Chinese texts, add a tool to convert the existing torrent file names
|
2013-07-13 11:45:55 +08:00 |
|
Kevin Lynx
|
676d354515
|
disable numid for sphinx by default
|
2013-07-12 10:27:23 +08:00 |
|
Kevin Lynx
|
6ddb9447ac
|
Merge branch 'master' of github.com:kevinlynx/dhtcrawler2
Conflicts:
ebin/dhtcrawler.app
ebin/tor_download.beam
|
2013-07-12 09:22:09 +08:00 |
|
Kevin Lynx
|
f5965304f7
|
add torrent download stats for hash reader
|
2013-07-11 22:38:39 +08:00 |
|
Kevin Lynx
|
1320002674
|
integrate torrent downloader monitor, change http today_top to show today's request count instead of total request count, remove ibrowse initial config
|
2013-07-11 22:01:47 +08:00 |
|
Kevin Lynx
|
5a0b21c7b0
|
change http top query, add a new database to map date to hashes, to support query by date range
|
2013-07-11 20:35:16 +08:00 |
|
Kevin Lynx
|
cda02229ad
|
add tor_download req monitor, not integrated yet
|
2013-07-11 17:50:32 +08:00 |
|
Kevin Lynx
|
42b32810c6
|
torbuilder(importer) fix badarith bug when there are torrent files with invalid names
|
2013-07-11 09:06:58 +08:00 |
|
Kevin Lynx
|
4adc0a7df4
|
change http start to pass dbpool size
|
2013-07-09 22:38:34 +08:00 |
|
Kevin Lynx
|
9982141e7b
|
change hash_reader shell startup script
|
2013-07-09 21:50:12 +08:00 |
|
Kevin Lynx
|
164c0f0f21
|
change inc_announce response fields to empty (only _id)
|
2013-07-09 21:43:26 +08:00 |
|
Kevin Lynx
|
aa7e8bb18a
|
use safe insert in torrent importer
|
2013-07-09 16:58:50 +08:00 |
|
Kevin Lynx
|
40dbbeb581
|
fix hash_reader stop working bug when there's only wait_download hash
|
2013-07-09 15:06:13 +08:00 |
|
Kevin Lynx
|
03f98c35be
|
update torrent importer, save process state so that it can resume on next launch
|
2013-07-08 22:16:26 +08:00 |
|
Kevin Lynx
|
0c6097f130
|
separate embedded css style to page.temp
|
2013-07-08 12:43:27 +08:00 |
|
Kevin Lynx
|
00cfa07d88
|
add a tool to add local torrents to database
|
2013-07-07 19:52:20 +08:00 |
|
Kevin Lynx
|
b42a85e929
|
fix hash_reader stop bug (send a wrong message)
|
2013-07-06 21:24:44 +08:00 |
|
Kevin Lynx
|
d17d78ec08
|
add config to control whether load torrents from database cache
|
2013-07-06 09:41:51 +08:00 |
|
Kevin Lynx
|
cedd378fa6
|
IMPORTANT: add number id to torrent database
|
2013-07-06 01:19:06 +08:00 |
|
Kevin Lynx
|
01f7fd1aaf
|
stats nice print
|
2013-07-05 23:49:35 +08:00 |
|
Kevin Lynx
|
48022521c9
|
change torrent path config name
|
2013-07-05 23:08:26 +08:00 |
|
Kevin Lynx
|
ad8c931c80
|
stats on torrent cache used
|
2013-07-05 22:52:15 +08:00 |
|
Kevin Lynx
|
dc8cab211f
|
fix torrent download memory leak bug
|
2013-07-05 21:53:22 +08:00 |
|
Kevin Lynx
|
7c777f64b6
|
add torrent downloader
|
2013-07-05 21:07:35 +08:00 |
|
Kevin Lynx
|
b598fb02ea
|
add torrent cache, not tested now
|
2013-07-03 22:55:17 +08:00 |
|
Kevin Lynx
|
7b811ddd32
|
add http short keyword checking
|
2013-07-03 22:35:12 +08:00 |
|
Kevin Lynx
|
0fcf9f8dda
|
use only one cache updating process, and cache stats
|
2013-07-03 21:58:06 +08:00 |
|
Kevin Lynx
|
10c11df943
|
fix http cache bug
|
2013-07-03 20:02:28 +08:00 |
|
Kevin Lynx
|
d82839ad27
|
async update cache
|
2013-07-03 17:53:42 +08:00 |
|
Kevin Lynx
|
4ca1885320
|
update http cache in a process
|
2013-07-03 17:46:15 +08:00 |
|
Kevin Lynx
|
c1f99eccd5
|
add http search cache
|
2013-07-03 17:19:28 +08:00 |
|