MarginaliaSearch/code/processes/live-crawler/test/nu/marginalia/livecrawler/LiveCrawlDataSetTest.java
Viktor Lofgren a91ab4c203 (live-crawler) Crude first-try process for live crawling #WIP
Some refactoring is still needed, but a dummy actor is in place and a process that crawls URLs from the livecapture service's RSS endpoints; that makes it all the way to being indexable.
2024-11-19 19:35:01 +01:00

package nu.marginalia.livecrawler;

import nu.marginalia.model.EdgeUrl;
import org.junit.jupiter.api.Assertions;
import org.junit.jupiter.api.Test;

import java.nio.file.Files;
import java.nio.file.Path;

public class LiveCrawlDataSetTest {

    @Test
    public void testGetDataSet() throws Exception {
        // Back the data set with a temporary file that is removed after the test
        Path tempFile = Files.createTempFile("test", ".db");
        try {
            LiveCrawlDataSet dataSet = new LiveCrawlDataSet(tempFile.toString());

            // A freshly created data set should not know about the URL yet
            Assertions.assertFalse(dataSet.hasUrl("https://www.example.com/"));

            dataSet.saveDocument(
                    1,
                    new EdgeUrl("https://www.example.com/"),
                    "test",
                    "test",
                    "test"
            );

            // Once a document has been saved, its URL should be reported as present
            Assertions.assertTrue(dataSet.hasUrl("https://www.example.com/"));
        }
        finally {
            // Clean up the temporary file even if an assertion fails
            Files.delete(tempFile);
        }
    }
}