Search results

  • ...ARC Manager is a web-based UI for managing and querying collections of web crawl data. |function=File Management, Web Capture
    936 bytes (137 words) - 16:57, 26 November 2021
  • Brozzler is a distributed web crawler that uses a real browser (Chrome or Chromium) to fetch pages and em Brozzler is designed to work in conjunction with warcprox for web archiving.
    2 KB (275 words) - 16:16, 9 December 2021
  • |purpose=w3act is an annotation and curation tool for web archives |content=Web
    873 bytes (130 words) - 16:11, 9 December 2021
  • |organisation=UK Government Web Archive ...several times until a satisfactory crawl is completed. If no satisfactory crawl can be made in this way, the site will be captured with Conifer.
    5 KB (748 words) - 17:02, 9 December 2021
  • |purpose=Heritrix is an open-source web crawler, allowing users to target websites they wish to include in a collec |function=Web Capture
    5 KB (753 words) - 15:59, 26 November 2021
  • ...e QA itself does not rely on a single technology, but requires Web Archives crawl and replay tech.
    3 KB (427 words) - 20:20, 2 June 2023
  • |platforms=Web based |function=Web Capture
    926 bytes (133 words) - 16:55, 26 November 2021
  • |input=Web Archives visual replay and crawl report data |output=Adjustments to seed URLs and scopes; the results of a future crawl; documentation
    3 KB (441 words) - 19:21, 15 June 2023
  • |input=Visual curatorial assessments of web archives captures. |output=Jira tickets, QA on web archives, emails.
    3 KB (535 words) - 20:40, 2 June 2023
  • |purpose=WebCite is an on-demand web archiving service that takes snapshots of Internet-accessible digital objec |function=Persistent Identification, Web Capture, Citation and Impact Tracking
    3 KB (436 words) - 16:46, 26 November 2021
  • |organisation=The National Archives (UK), UK Government Web Archive [[File:TNA QA Process Flow v1 (1).png|UK Government Web Archive Quality Assurance (QA) Workflow]]
    10 KB (1,809 words) - 11:35, 8 February 2024
  • ...Category:Web_Crawl]] is broader than just crawl. Could add an overarching "Web Archiving" category, then have sub categories. Would be nice to incorporate
    1 KB (202 words) - 09:15, 1 December 2014
  • |formats_out=HTTrack Crawl |function=Web Capture
    2 KB (299 words) - 15:57, 26 November 2021
  • |formats_in=HTTrack Crawl [[Category:Web]]
    2 KB (357 words) - 21:57, 25 May 2021