FIGS. 7A-7B are a flow diagram of an illustrative routine for identifying candidate duplicate documents for a submitted document and filtering out false duplicates from the candidate documents, in accordance with one or more embodiments of the disclosed subject matter;