81 Commits

Author SHA1 Message Date
Kevin Lynx
bc00e03b33 fix sphinx xml utf8 related issue, filter these unicode control characters, only backup delta file if the operation failed 2013-08-01 23:20:28 +08:00
Kevin Lynx
5e9c36f787 add sphinx search stats 2013-07-31 22:05:53 +08:00
Kevin Lynx
0bdac737ad add a simple page navigation for sphinx_search 2013-07-31 20:57:35 +08:00
Kevin Lynx
40f2bae9b8 fix sphinx_build memory leak bug, caused by mongo_cursor 2013-07-31 12:17:21 +08:00
Kevin Lynx
149f10724e sphinx worker call infinity 2013-07-30 22:43:02 +08:00
Kevin Lynx
18edffc2a1 fix some sphinx related bugs, now it can be used to build sphinx index, still in experiment stage, add `giza' library to query sphinx in http_fontend 2013-07-30 22:17:31 +08:00
Kevin Lynx
0c67e46e5c fix daterange issue which did not record only today's torrents, now it only shows the torrents inserted today 2013-07-23 22:16:40 +08:00
Kevin Lynx
28acbdaa45 adjust http stats display 2013-07-23 21:45:06 +08:00
Kevin Lynx
94a2ac34bc system stats adjust, add more stats to http front-end 2013-07-23 21:41:08 +08:00
Kevin Lynx
6fbd0cb218 add a new force to string log func, add log to httpd, it can log unicode characters to logfiles 2013-07-22 22:58:07 +08:00
Kevin Lynx
928798ed28 complete all http uri to json api 2013-07-22 21:23:44 +08:00
Kevin Lynx
13d35a44c1 add query stats for new hash_writer 2013-07-21 22:20:16 +08:00
Kevin Lynx
070e97e826 add hash filter stats to the new hash_reader 2013-07-21 22:10:05 +08:00
Kevin Lynx
5d211c3f14 add `size' function to hash_download_cache, to debug 2013-07-21 21:52:44 +08:00
Kevin Lynx
3864940905 fix hash_download startup bug 2013-07-21 21:32:19 +08:00
Kevin Lynx
67ff84adaa fix hash_download_cache startup bug 2013-07-21 21:30:28 +08:00
Kevin Lynx
e5b35e58ed NOTE: rewrite hash_reader, config changed, dht_hash database changed, require to remove existed dht_hash database 2013-07-21 21:13:05 +08:00
Kevin Lynx
72c35be437 change default config 2013-07-21 09:24:33 +08:00
Kevin Lynx
d00c84135b fix cache_indexer message leak bug 2013-07-20 19:37:41 +08:00
Kevin Lynx
d9deb8dfc9 add simple `get' json api, fix http search space decode 2013-07-20 10:57:27 +08:00
Kevin Lynx
ba92e9cd77 fix hash_date 2013-07-19 21:31:36 +08:00
Kevin Lynx
28fe69d141 hash_date only record today new inserted torrents 2013-07-19 21:00:37 +08:00
Kevin Lynx
45ca7d584e config max download task per hash-reader 2013-07-18 22:03:47 +08:00
Kevin Lynx
35a131fa8f nothing 2013-07-18 14:03:34 +08:00
Kevin Lynx
976740ea57 hash_writer write cache hashes 100 by 100, not all caches 2013-07-18 13:56:51 +08:00
Kevin Lynx
928fc86934 recompile 2013-07-18 13:17:06 +08:00
Kevin Lynx
f5655ba0f3 fix hash_reader stop working bug 2013-07-18 12:38:31 +08:00
Kevin Lynx
810464330d NOTE: big change! Need to delete config files. The crawler will cache hashes and merge duplicated queries. 2013-07-17 22:55:35 +08:00
Kevin Lynx
629e92115d fix cache_indexer download bug 2013-07-17 19:11:01 +08:00
Kevin Lynx
ff338f2c9b fix cache_indexer state not saved correctly 2013-07-16 22:49:08 +08:00
Kevin Lynx
c85e216951 fix cache_indexer 2013-07-16 22:24:55 +08:00
Kevin Lynx
1ed66b3863 fix memory leak for hash_reader (message queue keep increasing), set http search result to 50 2013-07-16 21:44:16 +08:00
Kevin Lynx
ff85af0806 try to fix high cpu usage when no hash and no wait_download 2013-07-15 23:01:26 +08:00
Kevin Lynx
c5db7ae966 restore `top' cache 2013-07-15 22:14:09 +08:00
Kevin Lynx
31a1bd04c0 to avoid there's no hash and no wait_download, the hash reader may stop working 2013-07-15 22:04:41 +08:00
Kevin Lynx
d81d6a2fd2 integrate cache_index to hash_reader, default is disabled 2013-07-15 21:27:01 +08:00
Kevin Lynx
0f24428faa add cache_indexer progress displaying 2013-07-15 13:39:41 +08:00
Kevin Lynx
5153568dc9 add cache_indexer, not integrated now, see src/cache_indexer/readme.md 2013-07-14 22:59:47 +08:00
Kevin Lynx
0579304407 change hash_reader read hash/wait_download using findAndModify, to avoid the read/delete two operations 2013-07-14 15:33:46 +08:00
Kevin Lynx
552dcb9983 fix name_segger 2013-07-14 13:53:03 +08:00
Kevin Lynx
8d71c043bb fix name_seger tool 2013-07-14 11:44:32 +08:00
Kevin Lynx
40bdebc5b4 change name_seger tool to multi-processes 2013-07-14 11:17:15 +08:00
Kevin Lynx
86665cb93b only build torrent name indexes 2013-07-14 10:00:38 +08:00
Kevin Lynx
a1fc6ec3c0 add text segment config for hash_reader (text_seg), the default is simple 2013-07-13 22:27:17 +08:00
Kevin Lynx
59b54380c8 minor fix on name_seger 2013-07-13 12:05:41 +08:00
Kevin Lynx
81e184c396 remove debug info 2013-07-13 11:51:35 +08:00
Kevin Lynx
269584c708 add rmmseg, to segment chinese texts, add a tool to convert the existing torrent file names 2013-07-13 11:45:55 +08:00
Kevin Lynx
676d354515 disable numid for sphinx default 2013-07-12 10:27:23 +08:00
Kevin Lynx
6ddb9447ac Merge branch 'master' of github.com:kevinlynx/dhtcrawler2
Conflicts:
	ebin/dhtcrawler.app
	ebin/tor_download.beam
2013-07-12 09:22:09 +08:00
Kevin Lynx
f5965304f7 add torrent download stats for hash reader 2013-07-11 22:38:39 +08:00