The DeDuplicator (Heritrix add-on module)

From COPTR
Revision as of 17:41, 12 November 2013 by COPTR Bot (talk | contribs) (Trial import from script.)
Homepage: http://deduplicator.sourceforge.net/


Description

The DeDuplicator is an add-on module for Heritrix to reduce the amount of duplicate data collected in a series of snapshot crawls.

User Experiences

Development Activity
