=======================> wkhtmltopdf 0.10.0 rc2 Manual <========================
This file documents wkhtmltopdf, a program capable of converting HTML documents
into PDF documents.

==================================> Contact <===================================
If you experience bugs or want to request new features, please visit
<http://code.google.com/p/wkhtmltopdf/issues/list>. If you have any problems or
comments, please feel free to contact me: see
<http://www.madalgo.au.dk/~jakobt/#about>.

===========================> Reduced Functionality <============================
Some versions of wkhtmltopdf are compiled against a version of Qt without the
wkhtmltopdf patches. These versions are missing some features. You can find out
whether your version of wkhtmltopdf is one of these by running
wkhtmltopdf --version. If your version is built against an unpatched Qt, you can
use the static version to get all functionality.

Currently the list of features only supported with patched Qt includes:

 * Printing more than one HTML document into a PDF file.
 * Running without an X11 server.
 * Adding a document outline to the PDF file.
 * Adding headers and footers to the PDF file.
 * Generating a table of contents.
 * Adding links in the generated PDF file.
 * Printing using the screen media-type.
 * Disabling the smart shrink feature of WebKit.

==================================> License <===================================
Copyright (C) 2010 wkhtmltopdf/wkhtmltoimage Authors.

License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it. There is NO
WARRANTY, to the extent permitted by law.

==================================> Authors <===================================
Written by Jan Habermann, Christian Sciberras and Jakob Truelsen. Patches by
Mehdi Abbad, Lyes Amazouz, Pascal Bach, Emmanuel Bouthenot, Benoit Garret and
Mário Silva.

==================================> Synopsis <==================================
  wkhtmltopdf [GLOBAL OPTION]... [OBJECT]... <output file>

==============================> Document objects <==============================
wkhtmltopdf is able to put several objects into the output file. An object is
either a single web page, a cover web page or a table of contents. The objects
are put into the output document in the order they are specified on the command
line. Options can be specified on a per-object basis or in the global options
area. Options from the Global Options section can only be placed in the global
options area.

A page object puts the content of a single web page into the output document.

  (page)? <input url/file name> [PAGE OPTION]...

Options for the page object can be placed in the global options and the page
options areas. The applicable options can be found in the Page Options and
Headers And Footer Options sections.

A cover object puts the content of a single web page into the output document.
The page does not appear in the table of contents, and does not have headers
and footers.

  cover <input url/file name> [PAGE OPTION]...

All options that can be specified for a page object can also be specified for a
cover.

A table of contents object inserts a table of contents into the output
document.

  toc [TOC OPTION]...

All options that can be specified for a page object can also be specified for a
toc. Furthermore, the options from the TOC Options section can also be applied.
The table of contents is generated via XSLT, which means that it can be styled
to look however you want it to look. To get an idea of how to do this, you can
dump the default XSLT document by supplying --dump-default-toc-xsl, and the
outline it works on by supplying --dump-outline; see the Outline Options
section.

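The ordering rules above can be sketched in Python. This is only an
illustration and not part of wkhtmltopdf: the function name build_command and
the (kind, source, options) tuple layout are assumptions made for this example,
and only options documented in this manual are used.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical representation of one object on the command line:
# (kind, source, options) where kind is "page", "cover" or "toc".
# A toc object takes no input, so its source is None.
DocObject = Tuple[str, Optional[str], Sequence[str]]


def build_command(global_options: Sequence[str],
                  objects: Sequence[DocObject],
                  output: str,
                  binary: str = "wkhtmltopdf") -> List[str]:
    """Assemble an argument list in synopsis order:
    global options, then each object with its own options, then the output."""
    argv = [binary, *global_options]
    for kind, source, options in objects:
        if kind == "toc":
            argv.append("toc")
        elif kind == "cover":
            argv += ["cover", source]
        elif kind == "page":
            argv.append(source)  # the "page" keyword itself is optional
        else:
            raise ValueError(f"unknown object kind: {kind!r}")
        argv += list(options)  # per-object options follow the object
    argv.append(output)
    return argv


command = build_command(
    ["--page-size", "Letter"],
    [
        ("cover", "cover.html", []),
        ("toc", None, ["--toc-header-text", "Contents"]),
        ("page", "chapter1.html", ["--no-background"]),
    ],
    "book.pdf",
)
print(" ".join(command))
```

The resulting list can be handed to subprocess.run unchanged, which avoids
shell quoting problems with URLs.
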
===============================> Global Options <===============================
      --collate                         Collate when printing multiple copies
                                        (default)
      --no-collate                      Do not collate when printing multiple
                                        copies
      --cookie-jar <path>               Read and write cookies from and to the
                                        supplied cookie jar file
      --copies <number>                 Number of copies to print into the pdf
                                        file (default 1)
  -d, --dpi <dpi>                       Change the dpi explicitly (this has no
                                        effect on X11 based systems)
  -H, --extended-help                   Display more extensive help, detailing
                                        less common command switches
  -g, --grayscale                       PDF will be generated in grayscale
  -h, --help                            Display help
      --htmldoc                         Output program html help
      --image-dpi * <integer>           When embedding images scale them down to
                                        this dpi (default 600)
      --image-quality * <integer>       When jpeg compressing images use this
                                        quality (default 94)
  -l, --lowquality                      Generates lower quality pdf/ps. Useful to
                                        shrink the result document space
      --manpage                         Output program man page
  -B, --margin-bottom <unitreal>        Set the page bottom margin (default 10mm)
  -L, --margin-left <unitreal>          Set the page left margin (default 10mm)
  -R, --margin-right <unitreal>         Set the page right margin (default 10mm)
  -T, --margin-top <unitreal>           Set the page top margin (default 10mm)
  -O, --orientation <orientation>       Set orientation to Landscape or Portrait
                                        (default Portrait)
      --output-format <format>          Specify an output format to use pdf or ps,
                                        instead of looking at the extension of the
                                        output filename
      --page-height <unitreal>          Page height
  -s, --page-size <Size>                Set paper size to: A4, Letter, etc.
                                        (default A4)
      --page-width <unitreal>           Page width
      --no-pdf-compression *            Do not use lossless compression on pdf
                                        objects
  -q, --quiet                           Be less verbose
      --read-args-from-stdin            Read command line arguments from stdin
      --readme                          Output program readme
      --title <text>                    The title of the generated pdf file (The
                                        title of the first document is used if not
                                        specified)
      --use-xserver *                   Use the X server (some plugins and other
                                        stuff might not work without X11)
  -V, --version                         Output version information and exit

Items marked * are only available using patched Qt.

==============================> Outline Options <===============================
      --dump-default-toc-xsl *          Dump the default TOC xsl style sheet to
                                        stdout
      --dump-outline * <file>           Dump the outline to a file
      --outline *                       Put an outline into the pdf (default)
      --no-outline *                    Do not put an outline into the pdf
      --outline-depth * <level>         Set the depth of the outline (default 4)

Items marked * are only available using patched Qt.

================================> Page Options <================================
      --allow <path>                    Allow the file or files from the specified
                                        folder to be loaded (repeatable)
      --background                      Do print background (default)
      --no-background                   Do not print background
      --checkbox-checked-svg <path>     Use this SVG file when rendering checked
                                        checkboxes
      --checkbox-svg <path>             Use this SVG file when rendering unchecked
                                        checkboxes
      --cookie <name> <value>           Set an additional cookie (repeatable)
      --custom-header <name> <value>    Set an additional HTTP header (repeatable)
      --custom-header-propagation       Add HTTP headers specified by
                                        --custom-header for each resource request.
      --no-custom-header-propagation    Do not add HTTP headers specified by
                                        --custom-header for each resource request.
      --debug-javascript                Show javascript debugging output
      --no-debug-javascript             Do not show javascript debugging output
                                        (default)
      --default-header *                Add a default header, with the name of the
                                        page to the left, and the page number to
                                        the right, this is short for:
                                        --header-left='[webpage]'
                                        --header-right='[page]/[toPage]' --top 2cm
                                        --header-line
      --encoding <encoding>             Set the default text encoding, for input
      --disable-external-links *        Do not make links to remote web pages
      --enable-external-links *         Make links to remote web pages (default)
      --disable-forms *                 Do not turn HTML form fields into pdf form
                                        fields (default)
      --enable-forms *                  Turn HTML form fields into pdf form fields
      --images                          Do load or print images (default)
      --no-images                       Do not load or print images
      --disable-internal-links *        Do not make local links
      --enable-internal-links *         Make local links (default)
  -n, --disable-javascript              Do not allow web pages to run javascript
      --enable-javascript               Do allow web pages to run javascript
                                        (default)
      --javascript-delay <msec>         Wait some milliseconds for javascript to
                                        finish (default 200)
      --load-error-handling <handler>   Specify how to handle pages that fail to
                                        load: abort, ignore or skip (default
                                        abort)
      --disable-local-file-access       Do not allow conversion of a local file
                                        to read in other local files, unless
                                        explicitly allowed with --allow
      --enable-local-file-access        Allow conversion of a local file to read
                                        in other local files. (default)
      --minimum-font-size <int>         Minimum font size
      --exclude-from-outline *          Do not include the page in the table of
                                        contents and outlines
      --include-in-outline *            Include the page in the table of contents
                                        and outlines (default)
      --page-offset <offset>            Set the starting page number (default 0)
      --password <password>             HTTP Authentication password
      --disable-plugins                 Disable installed plugins (default)
      --enable-plugins                  Enable installed plugins (plugins will
                                        likely not work)
      --post <name> <value>             Add an additional post field (repeatable)
      --post-file <name> <path>         Post an additional file (repeatable)
      --print-media-type *              Use print media-type instead of screen
      --no-print-media-type *           Do not use print media-type instead of
                                        screen (default)
  -p, --proxy <proxy>                   Use a proxy
      --radiobutton-checked-svg <path>  Use this SVG file when rendering checked
                                        radiobuttons
      --radiobutton-svg <path>          Use this SVG file when rendering unchecked
                                        radiobuttons
      --run-script <js>                 Run this additional javascript after the
                                        page is done loading (repeatable)
      --disable-smart-shrinking *       Disable the intelligent shrinking strategy
                                        used by WebKit that makes the pixel/dpi
                                        ratio non-constant
      --enable-smart-shrinking *        Enable the intelligent shrinking strategy
                                        used by WebKit that makes the pixel/dpi
                                        ratio non-constant (default)
      --stop-slow-scripts               Stop slow running javascripts (default)
      --no-stop-slow-scripts            Do not stop slow running javascripts
      --disable-toc-back-links *        Do not link from section header to toc
                                        (default)
      --enable-toc-back-links *         Link from section header to toc
      --user-style-sheet <url>          Specify a user style sheet, to load with
                                        every page
      --username <username>             HTTP Authentication username
      --window-status <windowStatus>    Wait until window.status is equal to this
                                        string before rendering page
      --zoom <float>                    Use this zoom factor (default 1)

Items marked * are only available using patched Qt.

=========================> Headers And Footer Options <=========================
      --footer-center * <text>          Centered footer text
      --footer-font-name * <name>       Set footer font name (default Arial)
      --footer-font-size * <size>       Set footer font size (default 12)
      --footer-html * <url>             Adds a html footer
      --footer-left * <text>            Left aligned footer text
      --footer-line *                   Display line above the footer
      --no-footer-line *                Do not display line above the footer
                                        (default)
      --footer-right * <text>           Right aligned footer text
      --footer-spacing * <real>         Spacing between footer and content in mm
                                        (default 0)
      --header-center * <text>          Centered header text
      --header-font-name * <name>       Set header font name (default Arial)
      --header-font-size * <size>       Set header font size (default 12)
      --header-html * <url>             Adds a html header
      --header-left * <text>            Left aligned header text
      --header-line *                   Display line below the header
      --no-header-line *                Do not display line below the header
                                        (default)
      --header-right * <text>           Right aligned header text
      --header-spacing * <real>         Spacing between header and content in mm
                                        (default 0)
      --replace * <name> <value>        Replace [name] with value in header and
                                        footer (repeatable)

Items marked * are only available using patched Qt.

================================> TOC Options <=================================
      --disable-dotted-lines *          Do not use dotted lines in the toc
      --toc-header-text * <text>        The header text of the toc (default Table
                                        of Content)
      --toc-level-indentation * <width> For each level of headings in the toc
                                        indent by this length (default 1em)
      --disable-toc-links *             Do not link from toc to sections
      --toc-text-size-shrink * <real>   For each level of headings in the toc the
                                        font is scaled by this factor (default
                                        0.8)
      --xsl-style-sheet * <file>        Use the supplied xsl style sheet for
                                        printing the table of content

Items marked * are only available using patched Qt.

=============================> Specifying A Proxy <=============================
By default proxy information will be read from the environment variables: proxy,
all_proxy and http_proxy. Proxy options can also be specified with the -p
switch:

  <type> := "http://" | "socks5://"
  <userinfo> := <username> (":" <password>)? "@"
  <proxy> := "None" | <type>? <userinfo>? <host> (":" <port>)?

Here are some examples (in case you are unfamiliar with the BNF):

  http://user:password@myproxyserver:8080
  socks5://myproxyserver
  None

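For readers who prefer code to BNF, the grammar above can be approximated with
a regular expression. This Python sketch is an illustration only: the function
name parse_proxy is made up for this example, and the character classes used
for user name, password, host and port are assumptions, since the grammar does
not define them.

```python
import re
from typing import Dict, Optional

# One alternative per grammar rule: optional type, optional user name and
# password ending in "@", then the host and an optional numeric port.
PROXY_RE = re.compile(
    r"""^
    (?P<type>http://|socks5://)?
    (?:(?P<username>[^:@/]+)(?::(?P<password>[^@]*))?@)?
    (?P<host>[^:/@]+)
    (?::(?P<port>\d+))?
    $""",
    re.VERBOSE,
)


def parse_proxy(text: str) -> Optional[Dict[str, object]]:
    """Return the parts of a proxy string, or None for the literal "None"."""
    if text == "None":
        return None
    match = PROXY_RE.match(text)
    if match is None:
        raise ValueError(f"not a valid proxy string: {text!r}")
    fields: Dict[str, object] = match.groupdict()
    if fields["port"] is not None:
        fields["port"] = int(fields["port"])
    return fields


for example in ("http://user:password@myproxyserver:8080",
                "socks5://myproxyserver",
                "None"):
    print(example, "->", parse_proxy(example))
```
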
============================> Footers And Headers <=============================
Headers and footers can be added to the document by the --header-* and
--footer-* arguments respectively. In header and footer text strings supplied
to e.g. --header-left, the following variables will be substituted.

 * [page]       Replaced by the number of the page currently being printed
 * [frompage]   Replaced by the number of the first page to be printed
 * [topage]     Replaced by the number of the last page to be printed
 * [webpage]    Replaced by the URL of the page being printed
 * [section]    Replaced by the name of the current section
 * [subsection] Replaced by the name of the current subsection
 * [date]       Replaced by the current date in system local format
 * [time]       Replaced by the current time in system local format
 * [title]      Replaced by the title of the current page object
 * [doctitle]   Replaced by the title of the output document

As an example, specifying --header-right "Page [page] of [topage]" will result
in the text "Page x of y", where x is the number of the current page and y is
the number of the last page, appearing in the upper right corner of the
document.

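To make the substitution concrete, the following Python sketch mimics its
effect on a header string. It is not wkhtmltopdf's implementation; the function
name substitute and the choice to leave unknown names untouched are assumptions
made for this example.

```python
import re


def substitute(template, values):
    """Replace each [name] in template with values[name].

    Names missing from values are left in place unchanged."""
    def replace(match):
        name = match.group(1)
        return str(values.get(name, match.group(0)))
    return re.sub(r"\[(\w+)\]", replace, template)


# Values a renderer might supply while printing page 3 of a 12 page document.
print(substitute("Page [page] of [topage]", {"page": 3, "topage": 12}))
```
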
Headers and footers can also be supplied with HTML documents. As an example one
could specify --header-html header.html, and use the following content in
header.html:

  <html><head><script>
  // Fill every element whose class names a variable with that variable's value.
  function subst() {
    var vars = {};
    // The variables arrive in the query string of the document URL.
    var pairs = document.location.search.substring(1).split('&');
    for (var i = 0; i < pairs.length; ++i) {
      var z = pairs[i].split('=', 2);
      vars[z[0]] = unescape(z[1]);
    }
    var names = ['frompage','topage','page','webpage','section','subsection','subsubsection'];
    for (var k = 0; k < names.length; ++k) {
      var y = document.getElementsByClassName(names[k]);
      for (var j = 0; j < y.length; ++j) y[j].textContent = vars[names[k]];
    }
  }
  </script></head><body style="border:0; margin: 0;" onload="subst()">
  <table style="border-bottom: 1px solid black; width: 100%">
    <tr>
      <td class="section"></td>
      <td style="text-align:right">
        Page <span class="page"></span> of <span class="topage"></span>
      </td>
    </tr>
  </table>
  </body></html>

As can be seen from the example, the arguments are sent to the header/footer
HTML documents as GET query parameters.

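The same query string can be decoded outside the browser, which is handy when
testing a header template. This Python sketch uses only the standard library;
the function name header_variables and the example URL are hypothetical, chosen
to resemble what a header document might receive.

```python
from urllib.parse import parse_qs, urlsplit


def header_variables(url):
    """Return the variables passed to a header or footer document as a dict."""
    query = urlsplit(url).query
    parsed = parse_qs(query, keep_blank_values=True)
    # parse_qs returns a list per name; each variable is expected only once.
    return {name: values[0] for name, values in parsed.items()}


# Hypothetical URL, for illustration only.
example = "file:///tmp/header.html?page=3&topage=12&section=Getting%20started"
print(header_variables(example))
```
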
==================================> Outlines <==================================
wkhtmltopdf with patched Qt has support for PDF outlines, also known as
bookmarks. This can be enabled by specifying the --outline switch. The outlines
are generated based on the <h?> tags; for an in-depth description of how this
is done, see the Table Of Content section.

The outline tree can sometimes be very deep if the <h?> tags were spread too
generously in the HTML document. The --outline-depth switch can be used to
bound this.

==============================> Table Of Content <==============================
A table of contents can be added to the document by adding a toc object to the
command line. For example:

  wkhtmltopdf toc http://doc.trolltech.com/4.6/qstring.html qstring.pdf

The table of contents is generated based on the H tags in the input documents.
First an XML document is generated, then it is converted to HTML using XSLT.

The generated XML document can be viewed by dumping it to a file using the
--dump-outline switch. For example:

  wkhtmltopdf --dump-outline toc.xml http://doc.trolltech.com/4.6/qstring.html qstring.pdf

The XSLT document can be specified using the --xsl-style-sheet switch. For
example:

  wkhtmltopdf toc --xsl-style-sheet my.xsl http://doc.trolltech.com/4.6/qstring.html qstring.pdf

The --dump-default-toc-xsl switch can be used to dump the default XSLT style
sheet to stdout. This is a good start for writing your own style sheet:

  wkhtmltopdf --dump-default-toc-xsl

The XML document is in the namespace
"http://code.google.com/p/wkhtmltopdf/outline". It has a root node called
"outline" which contains a number of "item" nodes. An item can contain any
number of items. These are the outline subsections of the section the item
represents. An item node has the following attributes:

 * "title" the name of the section.
 * "page" the page number the section occurs on.
 * "link" a URL that links to the section.
 * "backLink" the name of the anchor the section will link back to.

The remaining TOC options only affect the default style sheet, so they will not
work when specifying a custom style sheet.

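Before writing an XSLT style sheet it can help to inspect the outline
programmatically. The Python sketch below walks a document shaped as described
above. The sample XML is hand-written for this example and is not real
--dump-outline output; the function names walk and outline_entries are
likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace stated in this manual, bound to a short prefix for lookups.
NS = {"o": "http://code.google.com/p/wkhtmltopdf/outline"}

# Hand-written sample following the documented structure (illustrative only).
SAMPLE = """
<outline xmlns="http://code.google.com/p/wkhtmltopdf/outline">
  <item title="Introduction" page="1" link="#intro" backLink="intro-back">
    <item title="Scope" page="2" link="#scope" backLink="scope-back"/>
  </item>
  <item title="Usage" page="3" link="#usage" backLink="usage-back"/>
</outline>
"""


def walk(node, depth=0):
    """Yield (depth, title, page) for every item below node, in document order."""
    for item in node.findall("o:item", NS):
        yield depth, item.get("title"), int(item.get("page"))
        yield from walk(item, depth + 1)


def outline_entries(xml_text):
    """Parse an outline document and return its entries as a flat list."""
    return list(walk(ET.fromstring(xml_text.strip())))


for depth, title, page in outline_entries(SAMPLE):
    print("  " * depth + f"{title} (page {page})")
```
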
===============================> Page Breaking <================================
The current page breaking algorithm of WebKit leaves much to be desired.
Basically WebKit will render everything into one long page, and then cut it up
into pages. This means that if you have two columns of text where one is
vertically shifted by half a line, then WebKit will cut a line into two pieces,
displaying the top half on one page and the bottom half on another page. It
will also break images in two, and so on. If you are using the patched version
of Qt, you can use the CSS page-break-inside property to remedy this somewhat.
There is no easy solution to this problem; until it is solved, try organising
your HTML documents such that they contain many lines on which pages can be cut
cleanly.

See also: <http://code.google.com/p/wkhtmltopdf/issues/detail?id=9>,
<http://code.google.com/p/wkhtmltopdf/issues/detail?id=33> and
<http://code.google.com/p/wkhtmltopdf/issues/detail?id=57>.

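One way to apply page-break-inside without editing the source pages is a user
style sheet. The Python sketch below writes such a style sheet and returns the
matching --user-style-sheet option pair. The function names, the file name
no-split.css and the selector list are assumptions for illustration, and
whether a plain local path is accepted where the option expects a URL may
depend on the build.

```python
import os
import tempfile


def page_break_css(selectors):
    """Return a CSS rule asking the renderer not to split the given elements."""
    return ", ".join(selectors) + " { page-break-inside: avoid; }\n"


def style_sheet_args(selectors, directory):
    """Write the rule to a file and return the option pair that loads it."""
    path = os.path.join(directory, "no-split.css")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(page_break_css(selectors))
    return ["--user-style-sheet", path]


with tempfile.TemporaryDirectory() as workdir:
    args = style_sheet_args(["img", "tr", "blockquote"], workdir)
    print(args[0], os.path.basename(args[1]))
```
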
=================================> Page sizes <=================================
The default page size of the rendered document is A4, but using the --page-size
option this can be changed to almost anything else, such as: A3, Letter and
Legal. For a full list of supported page sizes please see
<http://doc.trolltech.com/4.6/qprinter.html#PageSize-enum>.

For more fine-grained control over the page size, the --page-height and
--page-width options may be used.

========================> Reading arguments from stdin <========================
If you need to convert a lot of pages in a batch, and you feel that wkhtmltopdf
is a bit too slow to start up, then you should try --read-args-from-stdin.

When --read-args-from-stdin is used, each line of input sent to wkhtmltopdf on
stdin will act as a separate invocation of wkhtmltopdf, with the arguments
specified on the given line combined with the arguments given to wkhtmltopdf.

For example one could do the following:

  echo "http://doc.trolltech.com/4.5/qapplication.html qapplication.pdf" >> cmds
  echo "cover google.com http://en.wikipedia.org/wiki/Qt_(toolkit) qt.pdf" >> cmds
  wkhtmltopdf --read-args-from-stdin --book < cmds

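The batch input can also be generated from a script. The Python sketch below
builds one line per job and, optionally, pipes the result to wkhtmltopdf. The
function names stdin_lines and run_batch are hypothetical, and because this
manual does not describe how quoting is handled on stdin, the sketch simply
refuses arguments that contain whitespace instead of guessing.

```python
import subprocess


def stdin_lines(jobs):
    """Turn a list of argument lists into one line of text per invocation."""
    lines = []
    for args in jobs:
        for arg in args:
            if not arg or any(ch.isspace() for ch in arg):
                raise ValueError(f"argument would need quoting: {arg!r}")
        lines.append(" ".join(args))
    return "\n".join(lines) + "\n"


def run_batch(jobs, binary="wkhtmltopdf"):
    """Feed all jobs to a single process (requires the binary on PATH)."""
    subprocess.run([binary, "--read-args-from-stdin"],
                   input=stdin_lines(jobs), text=True, check=True)


jobs = [
    ["http://doc.trolltech.com/4.5/qapplication.html", "qapplication.pdf"],
    ["cover", "google.com", "http://en.wikipedia.org/wiki/Qt_(toolkit)", "qt.pdf"],
]
payload = stdin_lines(jobs)
print(payload, end="")
```

Calling run_batch(jobs) would start wkhtmltopdf once for both conversions; it
is left uncalled here so the example runs without the binary installed.
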
===============================> Static version <===============================
On the wkhtmltopdf website you can download a static version of wkhtmltopdf
<http://code.google.com/p/wkhtmltopdf/downloads/list>. This static binary will
work on most systems and comes with a built-in patched Qt.

Unfortunately the static binary is not particularly static. On Linux it depends
on both glibc and openssl. Furthermore, you will need to have an X server
installed, but not necessarily running. You will also need to have various
fonts installed, including xfonts-scalable (Type1) and msttcorefonts. See
<http://code.google.com/p/wkhtmltopdf/wiki/static> for troubleshooting.

================================> Compilation <=================================
It can happen that the static binary does not work on your system for one
reason or another. In that case you might need to compile wkhtmltopdf
yourself.

*GNU/Linux:*

Before compilation you will need to install dependencies: X11, gcc, git and
openssl. On Debian/Ubuntu this can be done as follows:

  sudo apt-get build-dep libqt4-gui libqt4-network libqt4-webkit
  sudo apt-get install openssl build-essential xorg git-core git-doc libssl-dev

On other systems you must use your own package manager; the packages might be
named differently.

First you must check out the modified version of Qt:

  git clone git://gitorious.org/+wkhtml2pdf/qt/wkhtmltopdf-qt.git wkhtmltopdf-qt

Next you must configure, compile and install Qt. Note that this will take quite
some time, depending on what arguments you use to configure Qt:

  cd wkhtmltopdf-qt
  ./configure -nomake tools,examples,demos,docs,translations -opensource -prefix ../wkqt
  make -j3
  make install
  cd ..

All that is needed now is to compile wkhtmltopdf:

  git clone git://github.com/antialize/wkhtmltopdf.git wkhtmltopdf
  cd wkhtmltopdf
  ../wkqt/bin/qmake
  make -j3

You should now have a binary called wkhtmltopdf in the current folder that you
can use. You can optionally install it by running:

  make install

*Other operating systems and advanced features*

If you want more details or want to compile under operating systems other than
GNU/Linux, please see
<http://code.google.com/p/wkhtmltopdf/wiki/compilation>.

================================> Installation <================================
There are several ways to install wkhtmltopdf. You can download an already
compiled binary, or you can compile wkhtmltopdf yourself. On Windows the
easiest way to install wkhtmltopdf is to download the latest installer. On
Linux you can download the latest static binary; however, you still need to
install some other pieces of software. To learn more about this, read the
Static version section of the manual.

==================================> Examples <==================================
This section presents a number of examples of how to invoke wkhtmltopdf.

To convert a remote HTML file to PDF:

  wkhtmltopdf http://www.google.com google.pdf

To convert a local HTML file to PDF:

  wkhtmltopdf my.html my.pdf

You can also convert to PS files if you like:

  wkhtmltopdf my.html my.ps

Produce the eler2.pdf sample file:

  wkhtmltopdf --default-header http://geekz.co.uk/lovesraymond/archive/eler-highlights-2008 eler2.pdf

Printing a book with a table of contents:

  wkhtmltopdf --default-header cover cover.html toc chapter1.html chapter2.html chapter3.html book.pdf