| http://www.w3.org/ns/prov#value | - it cannot persist the catalog to disk or a database, so a very large site will consume a lot of memory just to hold the catalog. Most websites also contain more than just HTML pages: they link to Microsoft Word or other Office files, Adobe Acrobat (PDF) files, and other forms of content that Searcharoo currently cannot 'understand' (i.e., parse and catalog).
|