Techno Blender
Digitally Yours.

Convert HTML to PDF using Python. In this tutorial we will explore how to… | by Misha Sv | Jun, 2022

0 81


In this tutorial we will explore how to convert HTML files to PDF using Python

Photo by Florian Olivo on Unsplash

Table of Contents

  • Introduction
  • Sample HTML file
  • Convert HTML file to PDF using Python
  • Convert Webpage to PDF using Python
  • Conclusion

There are several online tools that allow you to convert HTML files and webpages to PDF, and most of them are free.

While it is a simple process, being able to automate it can be very useful for some HTML code testing as well as saving required webpages as PDF files.

To continue following this tutorial we will need:

wkhtmltopdf is an open source command line tool to render HTML files into PDF using the Qt WebKit rendering engine.

In order to use it in Python, we will also need the pdfkit library which is a wrapper for wkhtmltopdf utility.

First, search for the wkhtmltopdf installer for your operating system. For Windows, you can find the latest version of wkhtmltopdf installer here. Simply download the .exe file and install on your computer.

Remember the path to the directory where it will be installed.
In my case it is: C:\Program Files\wkhtmltopdf

If you don’t have the Python library installed, please open “Command Prompt” (on Windows) and install it using the following code:

pip install pdfkit

In order to continue in this tutorial we will need some HTML file to work with.

Here is a sample HTML file we will use in this tutorial:

https://pyshark.com/wp-content/uploads/2022/06/sample.html

If you download it and open in your browser, you should see:

and opening it in the code editor should show:

Let’s start with converting HTML file to PDF using Python.

The sample.html file is located in the same directory as the main.py file with the code:

First, we will need to find the path to the wkhtmltopdf executable file wkhtmltopdf.exe

Recall that we installed in C:\Program Files\wkhtmltopdf meaning that the .exe file is in that folder. Navigating to it, you should see that the path to executable file is: C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe

Now we have everything we need and can easily convert HTML file to PDF using Python:

And you should see sample.pdf created in the same directory:

Image by Author

which should should look like this.

Using pdfkit library you can also convert webpages into PDF using Python.

Let’s convert the wkhtmltopdf project page to PDF!

In this section we will reuse most of the code from the previous section, except now instead of using HTML file we will use the URL of a webpage and the .from_url() method of pdfkit class:

And you should see webpage.pdf created in the same directory:

Image by Author

which should should look like this.

In this article we explored how to convert HTML to PDF using Python and wkhtmltopdf.

Feel free to leave comments below if you have any questions or have suggestions for some edits and check out more of my Python Programming tutorials.


In this tutorial we will explore how to convert HTML files to PDF using Python

Photo by Florian Olivo on Unsplash

Table of Contents

  • Introduction
  • Sample HTML file
  • Convert HTML file to PDF using Python
  • Convert Webpage to PDF using Python
  • Conclusion

There are several online tools that allow you to convert HTML files and webpages to PDF, and most of them are free.

While it is a simple process, being able to automate it can be very useful for some HTML code testing as well as saving required webpages as PDF files.

To continue following this tutorial we will need:

wkhtmltopdf is an open source command line tool to render HTML files into PDF using the Qt WebKit rendering engine.

In order to use it in Python, we will also need the pdfkit library which is a wrapper for wkhtmltopdf utility.

First, search for the wkhtmltopdf installer for your operating system. For Windows, you can find the latest version of wkhtmltopdf installer here. Simply download the .exe file and install on your computer.

Remember the path to the directory where it will be installed.
In my case it is: C:\Program Files\wkhtmltopdf

If you don’t have the Python library installed, please open “Command Prompt” (on Windows) and install it using the following code:

pip install pdfkit

In order to continue in this tutorial we will need some HTML file to work with.

Here is a sample HTML file we will use in this tutorial:

https://pyshark.com/wp-content/uploads/2022/06/sample.html

If you download it and open in your browser, you should see:

and opening it in the code editor should show:

Let’s start with converting HTML file to PDF using Python.

The sample.html file is located in the same directory as the main.py file with the code:

First, we will need to find the path to the wkhtmltopdf executable file wkhtmltopdf.exe

Recall that we installed in C:\Program Files\wkhtmltopdf meaning that the .exe file is in that folder. Navigating to it, you should see that the path to executable file is: C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe

Now we have everything we need and can easily convert HTML file to PDF using Python:

And you should see sample.pdf created in the same directory:

Image by Author

which should should look like this.

Using pdfkit library you can also convert webpages into PDF using Python.

Let’s convert the wkhtmltopdf project page to PDF!

In this section we will reuse most of the code from the previous section, except now instead of using HTML file we will use the URL of a webpage and the .from_url() method of pdfkit class:

And you should see webpage.pdf created in the same directory:

Image by Author

which should should look like this.

In this article we explored how to convert HTML to PDF using Python and wkhtmltopdf.

Feel free to leave comments below if you have any questions or have suggestions for some edits and check out more of my Python Programming tutorials.

FOLLOW US ON GOOGLE NEWS

Read original article here

Denial of responsibility! Techno Blender is an automatic aggregator of the all world’s media. In each content, the hyperlink to the primary source is specified. All trademarks belong to their rightful owners, all materials to their authors. If you are the owner of the content and do not want us to publish your materials, please contact us by email – [email protected]. The content will be deleted within 24 hours.

Leave a comment