Edit Tool: Pdfcpu

Purpose: A Go library and command-line tool for PDF processing, including validation

== Description ==

pdfcpu is a Go library and command-line tool that supports many PDF processing functions, such as reading and writing the xref table, extracting images, fonts and embedded file attachments, and detecting and removing encryption. A full list of commands is available at: https://pdfcpu.io/about/command_set

=== Validation ===

pdfcpu validates PDFs up to version 1.7. There are two levels of validation, strict and relaxed, plus an optional verbose mode that prints the entire PDF syntax.

 pdfcpu validate -mode strict this.pdf
 validating(mode=strict) this.pdf ...
 validation error (try -mode=relaxed): dict=type1FontDict required entry=FirstChar missing

 pdfcpu validate -mode relaxed this.pdf
 validating(mode=relaxed) this.pdf ...
 validation ok

=== Metadata Extraction ===

Metadata can be extracted in two ways. The ''info'' command prints information such as title, author, PDF producer, and creation and modification dates, together with technical metadata such as encryption and permission information and whether the PDF is tagged, linearized or includes watermarks.

Sample output:

 pdfcpu info this.pdf
 PDF version: 1.6
 Page count: 96
 Page size: 21.00 x 29.70 cm
 .........................................
 Title: This is just a test
 Author: Digiman
 Subject:
 PDF Producer: Adobe PDF Library 15.0
 Content creator: Adobe InDesign CC 207 (Macintosh)
 Creation date: D:20190912181416+02'00'
 Modification date: D:20190918120753+02'00'
 Keywords: key1
 key2
 ..........................................
 Tagged: Yes
 Hybrid: No
 Linearized: No
 Using XRef streams: Yes
 Using object streams: Yes
 Watermarks: No
 ..........................................
 Encrypted: No
 Permissions: Full access

The second option is to extract any embedded metadata with ''extract -mode meta''. This writes txt files containing the extracted metadata entries to a specified directory.

Sample output:

 pdfcpu extract -mode meta this.pdf mdout
 extracting metadata from this.pdf into mdout/ ...
 writing mdout\this_Metadata_XObject_6499_6500.txt
 writing mdout\this_Metadata_unknown_401_33.txt
 writing mdout\this_Metadata_XObject_292_289.txt
 writing mdout\this_Metadata_Catalog_6455_385.txt
 writing mdout\this_Metadata_XObject_291_290.txt
 writing mdout\this_Metadata_unknown_6491_6475.txt

== User Experiences ==

== Development Activity ==

pdfcpu has an active user and developer community. All activity can be viewed in the GitHub repository: [https://github.com/pdfcpu/pdfcpu https://github.com/pdfcpu/pdfcpu]
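
For anyone who wants to script against the ''info'' output shown under Metadata Extraction, the following is a minimal Go sketch that turns the printed text into a key/value map. It is not part of pdfcpu: the function name parseInfo is hypothetical, and the sketch assumes the output keeps the "Key: value" line layout of the sample above, with dotted separator lines that contain no colon. Lines without a colon (such as a wrapped second keyword) are skipped.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseInfo is a hypothetical helper (not a pdfcpu API). It reads text in the
// "Key: value" layout of the sample `pdfcpu info` output and returns a map.
// Lines without a colon (dotted separators, blank lines) are ignored.
func parseInfo(out string) map[string]string {
	fields := make(map[string]string)
	sc := bufio.NewScanner(strings.NewReader(out))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		// Split at the first colon only, so values such as PDF date
		// strings ("D:2019...") keep their own colons.
		key, val, ok := strings.Cut(line, ":")
		if !ok {
			continue
		}
		fields[strings.TrimSpace(key)] = strings.TrimSpace(val)
	}
	return fields
}

func main() {
	// Excerpt copied from the sample output above.
	sample := `PDF version: 1.6
Page count: 96
.........................................
Title: This is just a test
Subject:
Creation date: D:20190912181416+02'00'
Tagged: Yes
Encrypted: No`

	info := parseInfo(sample)
	fmt.Println("Title:", info["Title"])
	fmt.Println("Created:", info["Creation date"])
	fmt.Println("Tagged:", info["Tagged"])
}
```

In practice the input string would come from capturing the command's standard output; the sketch deliberately leaves that step out so that it makes no assumptions about how pdfcpu is installed or invoked.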
