Editing Pagelyzer
Latest revision | Your text | ||
Line 1: | Line 1: | ||
− | {{ | + | === Summary === |
− | + | {{Infobox_tool | |
|purpose=Suite of tools for detecting changes in web pages and their rendering | |purpose=Suite of tools for detecting changes in web pages and their rendering | ||
− | | | + | |image=http://www.scape-project.eu/wp-content/uploads/2013/06/pagelyzer_small.png |
− | |||
− | |||
}} | }} | ||
− | {{ | + | |
− | = Description = | + | |
+ | '''Purpose:''' {excerpt}<br /> | ||
+ | {excerpt}<br /> | ||
+ | '''Homepage:''' [[https://github.com/openplanets/pagelyzer https://github.com/openplanets/pagelyzer]]<br /> | ||
+ | '''Source Code:''' [[https://github.com/openplanets/pagelyzer https://github.com/openplanets/pagelyzer]]<br /> | ||
+ | '''License:''' None<br /> | ||
+ | '''Cost:''' Free<br /> | ||
+ | '''Platform:''' Unix<br /> | ||
+ | [[Image:http://www.webaddress.logo/Insert_Logo_URL_Here|http://www.webaddress.logo/Insert_Logo_URL_Here]] | ||
+ | |||
+ | [[Category:Characterisation]] | ||
+ | |||
+ | === Description === | ||
+ | |||
Pagelyzer is a tool that compares two versions of a web page and decides whether they are similar. | Pagelyzer is a tool that compares two versions of a web page and decides whether they are similar. | ||
Line 19: | Line 30: | ||
The installation manual can be found [http://wiki.opf-labs.org/download/attachments/12059037/installation+manual-1-pagealyzer.pdf?version=4&modificationDate=1354896471000 here] | The installation manual can be found [http://wiki.opf-labs.org/download/attachments/12059037/installation+manual-1-pagealyzer.pdf?version=4&modificationDate=1354896471000 here] | ||
− | == How does it work? == | + | === How does it work? === |
Step 1: For each URL given as input, it takes a screen capture in PNG format and also produces an HTML document with the visual cues integrated, called Decorated HTML. This saves the state of the browser at the moment of capture and decouples the solution from any particular browser. | Step 1: For each URL given as input, it takes a screen capture in PNG format and also produces an HTML document with the visual cues integrated, called Decorated HTML. This saves the state of the browser at the moment of capture and decouples the solution from any particular browser. | ||
Line 29: | Line 40: | ||
== References == | == References == | ||
− | + | D. Cai, S. Yu, J.R. Wen, and W.Y. Ma. VIPS: a Vision-based Page Segmentation Algorithm. Technical report, Microsoft Research, 2003. | |
− | + | ||
− | + | Saad M.B., Gançarski S., Pehlivan Z. A Novel Web Archiving Approach based on Visual Pages Analysis. In 9th International Web Archiving Workshop (IWAW), ECDL 2009 | ||
− | + | ||
− | + | Sanoja, Gançarski S. “Yet another Web Page Segmentation Tool”. Proceedings iPRES 2012. Toronto, Canada, 2012 | ||
− | + | ||
+ | D. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60, 2004 | ||
+ | |||
+ | | Pehlivan Z., Saad M.B., Gançarski S. Understanding Web Pages Changes. ''DEXA 2010: 1-15'' | ||
+ | |||
+ | M. Teva Law, C. Sureda, N. Thome, S. Gançarski, M. Cord. Structural and Visual Similarity Learning for Web Page Archiving, Workshop CBMI 2012 | ||
− | = User Experiences = | + | === User Experiences and Test Data === |
− | + | [SP:SO18 Comparing two web page versions for web archiving] | |
− | = Development Activity = | + | === Development Activity === |
− | + | https://github.com/openplanets/pagelyzer/commits/master.atom | |
− | |||
− | |||
− |