Difference between revisions of "The DeDuplicator (Heritrix add-on module)"
(Trial import from script.) |
Prwheatley (talk | contribs) |
| (2 intermediate revisions by 2 users not shown) | |||
| Line 1: | Line 1: | ||
| − | {{ | + | {{Infobox tool |
| − | |purpose= The DeDuplicator is an add-on module for Heritrix to reduce the amount of duplicate data collected in a series of snapshot crawls. | + | |purpose=The DeDuplicator is an add-on module for Heritrix to reduce the amount of duplicate data collected in a series of snapshot crawls. |
| − | + | |homepage=http://landsbokasafn.github.io/DeDuplicator/ | |
| − | |homepage= http:// | + | |function=De-Duplication, Web Capture |
| − | | | + | |content=Web |
| − | | | + | }} |
| + | {{Infobox tool details | ||
| + | |ohloh_id=The DeDuplicator (Heritrix add-on module) | ||
}} | }} | ||
= Description = | = Description = | ||
The DeDuplicator is an add-on module for Heritrix to reduce the amount of duplicate data collected in a series of snapshot crawls. | The DeDuplicator is an add-on module for Heritrix to reduce the amount of duplicate data collected in a series of snapshot crawls. | ||
| Line 19: | Line 15: | ||
= Development Activity = | = Development Activity = | ||
Latest revision as of 16:32, 26 November 2021
Description
The DeDuplicator is an add-on module for Heritrix to reduce the amount of duplicate data collected in a series of snapshot crawls.