Difference between revisions of "NutchWAX"

From COPTR
 
Line 4:
 |license=GNU Lesser General Public License 2.1; Nutch itself is under Apache License 2.0.
 |platforms=Platform-independent Java, though only tested and primarily used on Linux machines.
-|function=Web Crawl
+|function=Web Capture
 |content=Web
 }}

Latest revision as of 16:06, 26 November 2021



NutchWAX is software for indexing ARC files (archived Web sites gathered using Heritrix) for full text search.
Homepage: http://archive-access.sourceforge.net/projects/nutchwax/
License: GNU Lesser General Public License 2.1; Nutch itself is under Apache License 2.0.
Platforms: Platform-independent Java, though only tested and primarily used on Linux machines.
Function: Web Capture
Content type: Web



Description

NutchWAX is software for indexing ARC files (archived Web sites gathered using Heritrix) for full-text search. It is based on Nutch, the open-source Web-search software. NutchWAX was developed by the Internet Archive and is written in Java.

User Experiences

Development Activity