mirror of
https://github.com/MarginaliaSearch/MarginaliaSearch.git
synced 2025-02-23 21:18:58 +00:00
This is a first step towards using WARC as an intermediate, flight-recorder-style step in the crawler, ultimately aimed at being able to resume crawls if the crawler is restarted. This component is currently not hooked into anything. The OkHttp3 client wrapper class 'WarcRecordingFetcherClient' was implemented for web archiving; it allows HTTP requests and responses to be recorded. New classes were introduced: 'WarcDigestBuilder', 'IpInterceptingNetworkInterceptor', and 'WarcProtocolReconstructor'. The JWarc dependency was added to the build.gradle file, and relevant unit tests were also introduced. Some HttpFetcher-adjacent structural changes were also made for better organization.
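To illustrate what a digest helper for WARC records might do, here is a minimal sketch. It is not the project's 'WarcDigestBuilder'; the class name `WarcDigestSketch` and its methods are hypothetical. It assumes the common WARC convention of labelling a digest as `algorithm:value`, with a SHA-1 hash encoded as unpadded Base32 (RFC 4648 alphabet).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration only; the real WarcDigestBuilder may work differently.
public class WarcDigestSketch {
    private static final char[] BASE32 =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567".toCharArray();

    // Encodes bytes as RFC 4648 Base32 without '=' padding.
    public static String base32(byte[] data) {
        StringBuilder sb = new StringBuilder();
        int buffer = 0; // holds bits not yet emitted (only the low bits matter)
        int bits = 0;   // number of pending bits in the buffer
        for (byte b : data) {
            buffer = (buffer << 8) | (b & 0xFF);
            bits += 8;
            while (bits >= 5) {
                sb.append(BASE32[(buffer >> (bits - 5)) & 0x1F]);
                bits -= 5;
            }
        }
        if (bits > 0) {
            // Left-align the remaining bits in a final 5-bit group.
            sb.append(BASE32[(buffer << (5 - bits)) & 0x1F]);
        }
        return sb.toString();
    }

    // Returns a labelled digest string such as "sha1:<base32>" for the payload.
    public static String sha1Digest(byte[] payload) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            return "sha1:" + base32(md.digest(payload));
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] body = "<html>hello</html>".getBytes(StandardCharsets.UTF_8);
        System.out.println(sha1Digest(body));
    }
}
```

A string of this shape could be placed in a record header such as `WARC-Payload-Digest`; whether the crawler uses SHA-1 or another algorithm is an assumption here.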
Common
These are packages containing the basic building blocks for running a service as well as shared models.
- db contains SQL code and some database-related utilities.
- config contains some `@Inject`ables.
- renderer contains utility code for rendering website templates.
- service contains the shared base classes for main methods and web services.
- service-client is the shared base class for RPC.
- service-discovery contains tools that let the services find each other.
- process contains boilerplate for batch processes.