Service
Contains the base classes for the services. This is where port configuration and common endpoints are set up.
Creating a new Service
The minimal service needs a MainClass and a Service class.
For proper initialization, the main class should look like this:
    public class FoobarMain extends MainClass {

        @Inject
        public FoobarMain(FoobarService service) {}

        public static void main(String... args) {
            init(ServiceId.Foobar, args);

            Injector injector = Guice.createInjector(
                    new FoobarModule(), /* optional custom bindings go here */
                    new DatabaseModule(),
                    new ConfigurationModule(SearchServiceDescriptors.descriptors,
                            ServiceId.Foobar));

            injector.getInstance(FoobarMain.class);

            // set the service as ready so that delayed tasks can be started
            injector.getInstance(Initialization.class).setReady();
        }
    }
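The last line of main() matters because some work should not start until the whole object graph has been wired. The sketch below illustrates that idea in isolation. ReadinessGate and its whenReady method are hypothetical names made up for this example; this is not the actual Initialization class, whose API beyond setReady() is not shown here and may well differ.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the role Initialization plays above; not the real class. */
final class ReadinessGate {
    private final List<Runnable> delayedTasks = new ArrayList<>();
    private boolean ready = false;

    /** Runs the task right away if the gate is already open, otherwise queues it. */
    public void whenReady(Runnable task) {
        synchronized (this) {
            if (!ready) {
                delayedTasks.add(task);
                return;
            }
        }
        task.run();
    }

    /** Opens the gate and runs the queued tasks once, in registration order. */
    public void setReady() {
        List<Runnable> toRun;
        synchronized (this) {
            if (ready) {
                return; // repeated calls are no-ops
            }
            ready = true;
            toRun = new ArrayList<>(delayedTasks);
            delayedTasks.clear();
        }
        // Run outside the lock so a task may register further tasks without deadlocking
        toRun.forEach(Runnable::run);
    }

    public synchronized boolean isReady() {
        return ready;
    }
}

class ReadinessGateDemo {
    public static void main(String[] args) {
        ReadinessGate gate = new ReadinessGate();
        List<String> log = new ArrayList<>();

        // Registered during wiring, but held back until setReady()
        gate.whenReady(() -> log.add("delayed task"));
        log.add("wiring done");

        gate.setReady();
        System.out.println(log);
    }
}
```

In this sketch a component can register a delayed task from its constructor without caring whether the rest of the service exists yet. That is presumably why setReady() is called only after injector.getInstance(FoobarMain.class) has returned.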
A service class has a boilerplate set-up that looks like this:
    @Singleton
    public class FoobarService extends Service {

        @Inject
        public FoobarService(BaseServiceParams params) {
            super(params);

            // set up Spark endpoints here
        }
    }
Furthermore, the new service needs to be added to the ServiceId enum in service-discovery.