WARC
Tools that have this format as input
Tool | Purpose |
---|---|
JHOVE (Harvard Object Validation Environment) | JHOVE provides functions to perform format-specific identification, validation, and characterization of digital objects. |
TweetSets | TweetSets provides a web interface that allows users to (1) select from existing datasets; (2) limit the dataset by querying on keywords, hashtags, and other parameters; (3) generate and download dataset derivatives such as the list of tweet IDs and mention nodes/edges. |
Warc Analyzer | A proof-of-concept client-side web app for analyzing WARC data using Webrecorder's warcio.js. No WARC data is uploaded anywhere; it runs on your machine. It is intended for archivists who have been given a pile of WARC data and want to quickly find out what it contains. |
Warc-proxy | Warc-proxy is a simple tool for viewing WARC content in Firefox. |
Warctools | Command-line tools and libraries for handling and manipulating WARC files (and HTTP content). |
Tools that have this format as output
Tool | Purpose |
---|---|
ArchiveBox | ArchiveBox is an open source tool that lets organizations and individuals archive both public and private web content while retaining control over their data. |
Perma.cc | A tool that captures, stores, and plays back web pages, providing a new URL for web citation. Built and maintained at the Harvard Law School Library. |
SFM (Social Feed Manager) | Social Feed Manager is open source software that provides a web interface to enable users to harvest social media data and web resources from Twitter and other social media platforms. |
WARCreate | Google Chrome browser extension for creating WARC files from web pages |
Warcit | Warcit is a command-line tool that converts directories (including nested directories), files (including HTML or other web assets and data files) and ZIP files to Web Archives (WARC). |
Warctools | Command-line tools and libraries for handling and manipulating WARC files (and HTTP content). |