NutchWAX

From COPTR
Revision as of 16:23, 22 April 2021 by Ania Molenda


NutchWAX is software for indexing ARC files (archived Web sites gathered using Heritrix) for full text search.
Homepage: http://archive-access.sourceforge.net/projects/nutchwax/
License: GNU Lesser General Public License 2.1; Nutch itself is under Apache License 2.0.
Platforms: Platform-independent Java, though only tested and primarily used on Linux machines.
Function: Web Crawl
Content type: Web



Description

NutchWAX is software for indexing ARC files (archived Web sites gathered using Heritrix) for full-text search. It is based on Nutch, the open-source Web search software. NutchWAX was developed by the Internet Archive and is written in Java.

User Experiences

Development Activity