Difference between revisions of "The DeDuplicator (Heritrix add-on module)"
(Import from spreadsheet via script.) |
Prwheatley (talk | contribs) |
Line 1: | Line 1: | ||
− | {{ | + | {{Infobox tool |
|purpose=The DeDuplicator is an add-on module for Heritrix to reduce the amount of duplicate data collected in a series of snapshot crawls. | |purpose=The DeDuplicator is an add-on module for Heritrix to reduce the amount of duplicate data collected in a series of snapshot crawls. | ||
− | + | |homepage=http://landsbokasafn.github.io/DeDuplicator/ | |
− | |homepage=http:// | + | |function=Web Crawl, De-Duplication |
− | | | + | |content=Web |
− | | | + | }} |
+ | {{Infobox tool details | ||
+ | |ohloh_id=The DeDuplicator (Heritrix add-on module) | ||
}} | }} | ||
= Description = | = Description = | ||
The DeDuplicator is an add-on module for Heritrix to reduce the amount of duplicate data collected in a series of snapshot crawls. | The DeDuplicator is an add-on module for Heritrix to reduce the amount of duplicate data collected in a series of snapshot crawls. | ||
Line 19: | Line 15: | ||
= Development Activity = | = Development Activity = | ||
Revision as of 12:14, 21 April 2021
Description
The DeDuplicator is an add-on module for Heritrix to reduce the amount of duplicate data collected in a series of snapshot crawls.