Warctools

From COPTR


Command-line tools and libraries for handling and manipulating WARC files (and HTTP content).
Homepage: https://pypi.python.org/pypi/warctools/
Source Code: https://github.com/internetarchive/warctools/
License: MIT License
Platforms: Cross-platform
Language: Python
Input Formats: WARC, ARC (Internet Archive)
Output Formats: WARC

Description

Warctools is the most current and well-maintained Python codebase for working with WARC files. It provides command-line tools for common WARC/ARC operations and can also be used as a library to create or process WARC files directly from Python.
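For readers unfamiliar with the format these tools operate on, the following is a minimal sketch of how a single uncompressed WARC/1.0 record is laid out and read back. It uses only the Python standard library and does not use the warctools API; the function names `build_record` and `read_records` are hypothetical and exist only for this illustration. It assumes uncompressed input and ignores details such as gzip members and header line folding.

```python
import io

CRLF = b"\r\n"


def build_record(record_type, target_uri, payload, record_id, date):
    """Serialise one record: version line, named fields, blank line, block, two CRLFs.

    Hypothetical helper for illustration only; not part of warctools.
    """
    header_lines = [
        b"WARC/1.0",
        b"WARC-Type: " + record_type.encode("ascii"),
        b"WARC-Record-ID: " + record_id.encode("ascii"),
        b"WARC-Date: " + date.encode("ascii"),
        b"WARC-Target-URI: " + target_uri.encode("utf-8"),
        b"Content-Length: " + str(len(payload)).encode("ascii"),
    ]
    return CRLF.join(header_lines) + CRLF + CRLF + payload + CRLF + CRLF


def read_records(stream):
    """Yield (headers, payload) pairs from an uncompressed WARC byte stream."""
    while True:
        version = stream.readline()
        if not version:
            return  # end of stream
        if not version.strip():
            continue  # blank separator lines between records
        if not version.startswith(b"WARC/"):
            raise ValueError("not a WARC record: %r" % version)
        headers = {}
        while True:
            line = stream.readline()
            if line in (CRLF, b"\n", b""):
                break  # blank line ends the header block
            name, _, value = line.decode("utf-8").partition(":")
            headers[name.strip()] = value.strip()
        # The block length is taken from the Content-Length field.
        payload = stream.read(int(headers["Content-Length"]))
        yield headers, payload


if __name__ == "__main__":
    data = build_record(
        "resource",
        "http://example.com/",
        b"hello",
        "<urn:uuid:00000000-0000-0000-0000-000000000001>",
        "2020-01-01T00:00:00Z",
    )
    for headers, payload in read_records(io.BytesIO(data)):
        print(headers["WARC-Type"], headers["WARC-Target-URI"], len(payload))
```

For real archives, which are usually gzip-compressed and may contain ARC records as well, the warctools command-line tools and library are the appropriate choice; the sketch above only shows the record structure.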

Pull requests and releases are currently managed by Thomas Figg, who can be contacted via Twitter.

Older Python WARC Implementations

This codebase was initially funded by the IIPC and developed by Hanzo Archives. This led to the hanzo-warc-tools package and source code.

There is also a separate warc package, created by the Internet Archive (see source code), that is no longer in use.

Both of these projects are defunct and are now superseded by the internetarchive/warctools project.

User Experiences

Development Activity

Releases


Development




Contributors

Andy Jackson (100.0%)