Warctools
Purpose: Command line tools and libraries for handling and manipulating WARC files (and HTTP contents)
Homepage: https://pypi.python.org/pypi/warctools/
Source code: https://github.com/internetarchive/warctools/
License: MIT License
Platforms: Cross-platform
Language: Python
Input formats: WARC, ARC (Internet Archive)
Output formats: WARC
Function: File Format Migration, Metadata Extraction, Validation
Content: Web
Ohloh ID: warctools
Latest revision as of 21:33, 25 May 2021
Description
This is the most current and well-maintained Python codebase for working with WARC files. It provides a number of command-line tools for common WARC/ARC operations, and can also act as a library to create or work with WARC files directly from Python.
Pull requests and releases are currently managed by Thomas Figg, who can be contacted via Twitter.
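As a rough illustration of what a record-level WARC operation involves, the sketch below walks the records of an uncompressed WARC stream using only the Python standard library. It does not use the warctools API. The helper names `iter_warc_records` and `make_record` are hypothetical, and the sample records are simplified: they omit mandatory fields such as WARC-Record-ID and WARC-Date. Real WARC files are also commonly gzip-compressed, which this sketch does not handle.

```python
import io


def iter_warc_records(stream):
    """Yield (headers, payload) for each record in an uncompressed WARC stream.

    Hypothetical helper for illustration only; not part of warctools.
    """
    while True:
        version = stream.readline()
        if not version:
            return  # end of stream
        if not version.strip():
            continue  # skip the blank lines that separate records
        if not version.startswith(b"WARC/"):
            raise ValueError("expected a WARC version line, got %r" % version)

        # Read "Name: value" header lines until the blank line that ends them.
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break
            name, _, value = line.decode("utf-8").partition(":")
            headers[name.strip()] = value.strip()

        # Content-Length gives the exact size of the record block in bytes.
        payload = stream.read(int(headers["Content-Length"]))
        yield headers, payload


def make_record(record_type, target_uri, payload):
    """Build a simplified WARC record (assumption: mandatory fields omitted)."""
    head = (
        "WARC/1.0\r\n"
        "WARC-Type: %s\r\n"
        "WARC-Target-URI: %s\r\n"
        "Content-Length: %d\r\n"
        "\r\n" % (record_type, target_uri, len(payload))
    ).encode("utf-8")
    # Each record is followed by two CRLF pairs.
    return head + payload + b"\r\n\r\n"


sample = (
    make_record("resource", "http://example.com/a", b"hello")
    + make_record("resource", "http://example.com/b", b"world!")
)
records = list(iter_warc_records(io.BytesIO(sample)))
for headers, payload in records:
    print(headers["WARC-Type"], headers["WARC-Target-URI"], len(payload))
```

For real archives, the command-line tools and library provided by this project are the better choice, since they also cover ARC input, compression, and validation.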
Older Python WARC Implementations
This codebase was initially funded by IIPC and developed by Hanzo Archives. This led to the hanzo-warc-tools package and source code.
There is also a separate warc package that was created by the Internet Archive (see source code), but is no longer in use.
Both of these projects are defunct and are now superseded by the internetarchive/warctools project.
User Experiences
Development Activity
Releases
- 2025-08-18 22:15:09: 5.0.1, by mistydemeo
- 2025-05-30 17:17:16: 5.0.0, by mistydemeo
- 2016-09-01 22:39:45: 4.10.0, by nlevitt
- 2012-11-29 13:31:13: 4.15-rc1, by lekash
- 2012-09-14 15:18:43: build_success-2012-09-14T16-25-56.483325901, by SteveJones
Development
- 2025-08-18 22:15:09: release: 5.0.1 (commit dda34c5d3dc20aec0a01776480a3820ab1cc9de2), by mistydemeo (https://github.com/mistydemeo)
- 2025-07-10 17:09:15: fix: typo in regex string (commit 36d7e8fcff1e143289cd43c8eee597c6b1fc914e), by mistydemeo (https://github.com/mistydemeo)
- 2025-05-30 18:01:33: config: migrate to pyproject.toml (commit 21db132fd3e4b4042cd011d9dc3fb30276a5a0b6), by mistydemeo (https://github.com/mistydemeo)
- 2025-05-30 17:20:41: gitignore: add dist (commit 4c17416597117dd50205d3273fc3342a0c511353), by mistydemeo (https://github.com/mistydemeo)
- 2025-05-30 17:17:16: release: 5.0.0 (commit df88bcd7ef880c64a729e57e554c65566459b711), by mistydemeo (https://github.com/mistydemeo)