diff --git a/code/features-index/lexicon/readme.md b/code/features-index/lexicon/readme.md
index bdd2f8f0..c4f0225c 100644
--- a/code/features-index/lexicon/readme.md
+++ b/code/features-index/lexicon/readme.md
@@ -1,7 +1,11 @@
# Lexicon
-The lexicon contains a mapping for words to identifiers. This lexicon is populated from a journal.
-The actual word data isn't mapped, but rather a 64 bit hash.
+The lexicon contains a mapping from words to identifiers.
+
+Index construction is simpler when the domain of word identifiers is dense, that is, when there are no gaps between ids: if there are 100 words, they are indexed 0-99 and not 5, 23, 107, 9999, 819235 etc. The lexicon exists to create such a mapping.
+
+This lexicon is populated from a journal. The actual word data isn't mapped; instead, a 64 bit hash of each word is. As a result of the birthday paradox, collisions will be rare up until about 2<sup>32</sup> words.
+
The lexicon is constructed by [processes/loading-process](../../processes/loading-process) and read when
[services-core/index-service](../../services-core/index-service) interprets queries.