NutchWAX

From COPTR
Revision as of 16:23, 22 April 2021 by Ania Molenda (talk | contribs)



NutchWAX is software for indexing ARC files (archived Web sites gathered using Heritrix) for full text search.
Homepage: http://archive-access.sourceforge.net/projects/nutchwax/
License: GNU Lesser General Public License 2.1; Nutch itself is under Apache License 2.0.
Platforms: Platform-independent Java, though only tested and primarily used on Linux machines.
Function: Web Crawl
Content type: Web




Description

NutchWAX is software for indexing ARC files (archived Web sites gathered using Heritrix) for full-text search. It is based on Nutch, the open-source Web search software, was developed by the Internet Archive, and is written in Java.

User Experiences

Development Activity