Search results
The page 'Web Crawl' does not exist on this wiki. You can fix that!
- ...ARC Manager is a web-based UI for managing and querying collections of web crawl data. |function=File Management, Web Capture
  936 bytes (137 words) - 16:57, 26 November 2021
- Brozzler is a distributed web crawler that uses a real browser (Chrome or Chromium) to fetch pages and em Brozzler is designed to work in conjunction with warcprox for web archiving.
  2 KB (275 words) - 16:16, 9 December 2021
- |purpose=w3act is an annotation and curation tool for web archives |content=Web
  873 bytes (130 words) - 16:11, 9 December 2021
- |organisation=UK Government Web Archive ...several times until a satisfactory crawl is completed. If no satisfactory crawl can be made in this way, the site will be captured with Conifer.
  5 KB (748 words) - 17:02, 9 December 2021
- |purpose=Heritrix is an open-source web crawler, allowing users to target websites they wish to include in a collec |function=Web Capture
  5 KB (753 words) - 15:59, 26 November 2021
- ...e QA itself does not rely on a single technology, but require Web Archives crawl and replay tech.
  3 KB (427 words) - 20:20, 2 June 2023
- |platforms=Web based |function=Web Capture
  926 bytes (133 words) - 16:55, 26 November 2021
- |input=Web Archives visual replay and crawl report data |output=Adjustments to seed URLs and scopes; the results of a future crawl; documentation
  3 KB (441 words) - 19:21, 15 June 2023
- |input=Visual curatorial assessments of web archives captures. |output=Jira tickets, QA on web archives, emails.
  3 KB (535 words) - 20:40, 2 June 2023
- |purpose=WebCite is an on-demand web archiving service that takes snapshots of Internet-accessible digital objec |function=Persistent Identification, Web Capture, Citation and Impact Tracking
  3 KB (436 words) - 16:46, 26 November 2021
- |organisation=The National Archives (UK), UK Government Web Archive [[File:TNA QA Process Flow v1 (1).png|UK Government Web Archive Quality Assurance (QA) Workflow]]<br>
  10 KB (1,809 words) - 11:35, 8 February 2024
- ...Category:Web_Crawl]] is broader than just crawl. Could add an overarching "Web Archiving" category, then have sub categories. Would be nice to incorporate
  1 KB (202 words) - 09:15, 1 December 2014
- |formats_out=HTTrack Crawl |function=Web Capture
  2 KB (299 words) - 15:57, 26 November 2021
- |formats_in=HTTrack Crawl [[Category:Web]]
  2 KB (357 words) - 21:57, 25 May 2021