NutchWAX is software for indexing ARC files (archived Web sites gathered using Heritrix) for full-text search.
Homepage: http://archive-access.sourceforge.net/projects/nutchwax/
License: GNU Lesser General Public License 2.1; Nutch itself is under Apache License 2.0.
Platforms: Platform-independent Java, though only tested and primarily used on Linux machines.
Function: Web Capture
Content type: Web
NutchWAX is software for indexing ARC files (archived Web sites gathered using Heritrix) for full-text search. It is based on Nutch, the open-source Web search software. NutchWAX was developed by the Internet Archive and is written in Java.
Contributors: Ania Molenda, COPTR Bot, Prwheatley