Difference between revisions of "Web Archive Discovery"
(Added activity feeds.) |
|||
Line 17: | Line 17: | ||
<!-- Add relevant categories to describe the content type that the tool addresses. Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). If the tool works on any content type, do not add a category. The following are common category examples, remove those that don't apply --> | <!-- Add relevant categories to describe the content type that the tool addresses. Choose carefully, and view the list of existing categories first (see the Navigation sidebar on the left). If the tool works on any content type, do not add a category. The following are common category examples, remove those that don't apply --> | ||
[[Category:Web]] | [[Category:Web]] | ||
+ | [[Category:Web_Archive]] | ||
== Description == | == Description == | ||
<!-- Describe the what the tool does, focusing on it's digital preservation value. Keep it factual. --> | <!-- Describe the what the tool does, focusing on it's digital preservation value. Keep it factual. --> | ||
− | Full-text indexing system, using Apache Solr as the search back-end. Supports command-line | + | Full-text indexing system, using Apache Solr as the search back-end. Supports command-line or large-scale map-reduce (Hadoop) processing of ARC and WARC files. Also integrates file format analysis and scans for some known preservation risks. |
== User Experiences == | == User Experiences == | ||
Line 26: | Line 27: | ||
* Used by the [http://www.webarchive.org.uk/ UK Web Archive] to provide access to their collections. More details TBA. | * Used by the [http://www.webarchive.org.uk/ UK Web Archive] to provide access to their collections. More details TBA. | ||
− | + | = Development Activity = | |
<!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --> | <!-- Provide *evidence* of development activity of the tool. For example, RSS feeds for code issues or commits. --> | ||
+ | All development activity is visible on GitHub: http://github.com/ukwa/webarchive-discovery | ||
+ | |||
+ | There is also a #webarchive-discovery channel on the IIPC Slack service. Contact https://twitter.com/NetPreserve for details. | ||
+ | |||
+ | === Release Feed === | ||
+ | Below the last 3 release feeds: | ||
+ | <rss max=3>https://github.com/ukwa/webarchive-discovery/releases.atom</rss> | ||
+ | |||
+ | |||
+ | === Activity Feed === | ||
+ | Below the last 5 commits: | ||
+ | <rss max=5>https://github.com/ukwa/webarchive-discovery/commits/master.atom</rss> | ||
+ | |||
− | |||
{{Infobox_tool_details | {{Infobox_tool_details | ||
− | |ohloh_id= | + | |releases_rss= |
+ | |issues_rss= | ||
+ | |mailing_lists= | ||
+ | |ohloh_id=Heritrix | ||
}} | }} |
Revision as of 14:16, 24 September 2018
Description
Full-text indexing system, using Apache Solr as the search back-end. Supports command-line or large-scale map-reduce (Hadoop) processing of ARC and WARC files. Also integrates file format analysis and scans for some known preservation risks.
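To make the WARC processing step more concrete, here is a minimal sketch of walking the records in an uncompressed WARC stream using only the Python standard library. It is not how webarchive-discovery itself reads files (the function name `iter_warc_records` and the sample data are hypothetical, made up for illustration), and it assumes well-formed records; real archives are usually gzip-compressed per record and should be read with a dedicated WARC library.

```python
import io


def iter_warc_records(stream):
    """Yield (headers, payload) for each record in an uncompressed WARC stream.

    Hypothetical helper for illustration only; assumes well-formed records.
    """
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        if not line.strip():
            continue  # skip the blank lines that separate records
        version = line.strip().decode("utf-8")
        if not version.startswith("WARC/"):
            raise ValueError(f"unexpected record start: {version!r}")

        # Named header fields run until the first empty line.
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break
            name, _, value = line.decode("utf-8").partition(":")
            headers[name.strip()] = value.strip()

        # Content-Length gives the size of the record block in bytes.
        length = int(headers.get("Content-Length", "0"))
        payload = stream.read(length)
        yield headers, payload


# Made-up sample record, not taken from any real archive.
SAMPLE = (
    b"WARC/1.0\r\n"
    b"WARC-Type: resource\r\n"
    b"WARC-Target-URI: http://example.org/\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
    b"\r\n\r\n"
)

for headers, payload in iter_warc_records(io.BytesIO(SAMPLE)):
    print(headers["WARC-Type"], headers["WARC-Target-URI"], len(payload))
```

An indexer would pass each payload on to text extraction and format analysis before sending a document to the search back-end; that part is omitted here because it depends on the tool's own configuration.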
User Experiences
- Used by the UK Web Archive to provide access to their collections. More details TBA.
Development Activity
All development activity is visible on GitHub: http://github.com/ukwa/webarchive-discovery
There is also a #webarchive-discovery channel on the IIPC Slack service. Contact https://twitter.com/NetPreserve for details.
Release Feed
The last 3 releases:
- 2024-04-02 09:25:58: Revert of source_file_path (warc-discovery-3.3.1), by GilHoggarth
- 2023-06-02 11:04:22: warc-discovery-3.3.0, by anjackson
- 2020-11-27 12:25:29: warc-discovery-3.1.0, by anjackson
Activity Feed
The last 5 commits:
- 2025-03-11 12:48:48: Merge pull request #320 from bnfleb/issue-319 (commit 4898ed804b3edaa3bdff84f46b2d1d3b71325660), by GilHoggarth (https://github.com/GilHoggarth)
- 2025-03-06 17:20:01: Remove default value for disable-commit CL parameter for issue #319 (commit f6750aa0bdf608d3137ec64818135f763b06d316), by bnfleb (https://github.com/bnfleb)
- 2024-08-09 10:57:54: Merge pull request #318 from lasztoth/langid-language-analyser (commit 40ce1635f79b8d9d13f3fa2a1577f0ca46aa8404), by GilHoggarth (https://github.com/GilHoggarth)
- 2024-08-09 10:36:40: Added correct version of artifact (commit 380afa66e0d45e569f0dd2971c1a8039daa90402), by KGX747@MC212515.gouv.etat.lu
- 2024-08-09 10:08:55: Update LanguageAnalyser.java (commit 170c8dfb3543159065af792cf226e2ea1726c852), by GilHoggarth (https://github.com/GilHoggarth)
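The release and commit lists above come from Atom feeds, so the same information can be read outside the wiki. The following is a minimal sketch, assuming a standard Atom document, of pulling the newest entries out of feed text with the Python standard library. The function name `latest_entries` and the sample feed are hypothetical; fetching the feed over the network is left out.

```python
import xml.etree.ElementTree as ET

# Namespace used by Atom 1.0 documents.
ATOM = "{http://www.w3.org/2005/Atom}"


def latest_entries(atom_xml, limit):
    """Return up to `limit` entries from Atom feed text, in feed order.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(atom_xml)
    entries = []
    for entry in root.findall(f"{ATOM}entry")[:limit]:
        entries.append({
            "id": entry.findtext(f"{ATOM}id", default=""),
            "title": entry.findtext(f"{ATOM}title", default="").strip(),
            "updated": entry.findtext(f"{ATOM}updated", default=""),
            "author": entry.findtext(f"{ATOM}author/{ATOM}name", default=""),
        })
    return entries


# Made-up sample in the general shape of an Atom feed, not real feed output.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example releases</title>
  <entry>
    <id>tag:example.org,2008:Repository/1/v2</id>
    <updated>2024-01-02T00:00:00Z</updated>
    <title>v2</title>
    <author><name>alice</name></author>
  </entry>
  <entry>
    <id>tag:example.org,2008:Repository/1/v1</id>
    <updated>2023-01-02T00:00:00Z</updated>
    <title>v1</title>
    <author><name>bob</name></author>
  </entry>
</feed>"""

for item in latest_entries(SAMPLE_FEED, limit=1):
    print(item["updated"], item["title"], "by", item["author"])
```

Feeds usually list the newest entry first, which is why truncating to `limit` approximates the "last N" lists shown here; that ordering is an assumption, not something the Atom format guarantees.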