Difference between revisions of "The DeDuplicator (Heritrix add-on module)"
(Import from spreadsheet via script.) |
Prwheatley (talk | contribs) |
||
| Line 1: | Line 1: | ||
| − | {{ | + | {{Infobox tool |
|purpose=The DeDuplicator is an add-on module for Heritrix to reduce the amount of duplicate data collected in a series of snapshot crawls. | |purpose=The DeDuplicator is an add-on module for Heritrix to reduce the amount of duplicate data collected in a series of snapshot crawls. | ||
| − | + | |homepage=http://landsbokasafn.github.io/DeDuplicator/ | |
| − | |homepage=http:// | + | |function=Web Crawl, De-Duplication |
| − | | | + | |content=Web |
| − | | | + | }} |
| + | {{Infobox tool details | ||
| + | |ohloh_id=The DeDuplicator (Heritrix add-on module) | ||
}} | }} | ||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
= Description = | = Description = | ||
The DeDuplicator is an add-on module for Heritrix to reduce the amount of duplicate data collected in a series of snapshot crawls. | The DeDuplicator is an add-on module for Heritrix to reduce the amount of duplicate data collected in a series of snapshot crawls. | ||
| Line 19: | Line 15: | ||
= Development Activity = | = Development Activity = | ||
| − | |||
| − | |||
| − | |||
| − | |||
Revision as of 12:14, 21 April 2021
Description
The DeDuplicator is an add-on module for Heritrix to reduce the amount of duplicate data collected in a series of snapshot crawls.