• Posted Dec. 10, 2013, 4:56 p.m.

How to Convert a PDF to an HTML Web Page

In the Pro version of Infix PDF Editor, you can take any PDF and export it as HTML, effectively converting it into a self-contained web page. This is a handy feature if you want to place content online quickly without forcing people to download a PDF file.

Which HTML Type?

When exporting HTML from Infix PDF Editor, you can choose from four export formats:

● Simple HTML

● HTML 3

● HTML 4 with a cascading style sheet (CSS)

● HTML 4 without a cascading style sheet

Simple HTML produces output that should work in all web browsers, but images are not displayed inline on the page. The images are still exported, and the page appears as plain text with links to them. The file is built from basic HTML tags only, so much of the text formatting is lost.

HTML 3 exports more styling and tags, and it supports inline images. This output type also automatically includes some metadata, such as author details and title information. HTML 3 is supported by most web browsers.

HTML 4 is suitable for viewing in slightly more modern browsers. This format uses CSS to control the text formatting and layout of the web page. Choose HTML 4 with CSS if you’d like to be able to adjust the page styling easily. If you prefer, you can export a flat file with no separate CSS output.
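
If you are not sure which kind of output an exported file contains, a short script can check it for you. The following is a minimal sketch in Python, not part of Infix. The exact markup Infix writes is not described here, so the script only looks for standard HTML constructs: inline img tags, links that point at image files, linked style sheets and embedded style blocks. The names ExportSummary and summarize_export, and the list of image extensions, are assumptions made for this example.

```python
from html.parser import HTMLParser

# Assumed list of image extensions; adjust it to match your export settings.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".bmp", ".tif", ".tiff")


class ExportSummary(HTMLParser):
    """Counts standard HTML constructs; assumes nothing Infix-specific."""

    def __init__(self):
        super().__init__()
        self.inline_images = 0   # number of <img> tags (images shown inline)
        self.image_links = []    # <a href="..."> targets that are image files
        self.stylesheets = []    # <link rel="stylesheet" href="..."> targets
        self.style_blocks = 0    # number of <style> elements in the page

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img":
            self.inline_images += 1
        elif tag == "a":
            href = attrs.get("href") or ""
            if href.lower().endswith(IMAGE_EXTENSIONS):
                self.image_links.append(href)
        elif tag == "link":
            rel = (attrs.get("rel") or "").lower()
            if "stylesheet" in rel.split() and attrs.get("href"):
                self.stylesheets.append(attrs["href"])
        elif tag == "style":
            self.style_blocks += 1


def summarize_export(html_text):
    """Parse one exported page and return the filled-in summary object."""
    parser = ExportSummary()
    parser.feed(html_text)
    parser.close()
    return parser
```

A page with no inline images but several image links would be consistent with a Simple HTML export, and a non-empty stylesheets list would be consistent with HTML 4 exported with a separate CSS file. To try it, read an exported file into a string and pass it to summarize_export.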

How to Export the PDF

Click File > Export > Pages As to start the export process. This displays the Export Pages window, which lets you choose which part of the document to export.

Click Format to show the Export Format window. This window has four tabs that control how the PDF is exported.

The General tab lets you define the export file type. Several file types are available, but for HTML you need to tick Text Output and choose one of the HTML options in the drop-down menu. Ticking Image Output also exports the images in the PDF.

The Text tab controls how the text flow and appearance are exported. The Try to preserve layout option here is particularly useful if you want the layout of the web page to be as close to the layout of the PDF as possible.

The Create a file for every page/article option here is also handy if you want your PDF to be split into multiple web pages, rather than fitting the entire content into a single web page.
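
When a PDF is split into multiple web pages, you may want a simple contents page that links to each one. The following Python sketch, which is not part of Infix, writes such a page for a folder of exported files. It assumes all exported pages sit in a single folder and end in .htm or .html. The file names Infix generates are not specified here, so the script sorts whatever it finds in natural order (page2 before page10). The function names build_index and natural_key are made up for this example.

```python
import html
import re
from pathlib import Path
from urllib.parse import quote


def natural_key(name):
    # Split digits from text so that "page10" sorts after "page2".
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def build_index(export_dir, index_name="index.html"):
    """Write a plain list of links to every exported page in export_dir."""
    export_dir = Path(export_dir)
    # Collect the exported pages, skipping any index written by an earlier run.
    pages = sorted(
        (p.name for p in export_dir.glob("*.htm*") if p.name != index_name),
        key=natural_key,
    )
    items = "\n".join(
        f'  <li><a href="{quote(name)}">{html.escape(name)}</a></li>'
        for name in pages
    )
    document = (
        "<!DOCTYPE html>\n"
        "<html>\n<head><title>Exported pages</title></head>\n"
        f"<body>\n<ul>\n{items}\n</ul>\n</body>\n</html>\n"
    )
    (export_dir / index_name).write_text(document, encoding="utf-8")
    return pages
```

Calling build_index with the path of your export folder creates index.html in that folder and returns the list of pages it linked, in order.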

The Image tab is used if you have ticked Image Output in the General tab.

The HTML tab controls the general appearance of the web page. This includes setting any headers or footers, the page background and navigation buttons.