In a recent project we needed to convert HTML to PDF. We first tried a PHP-based library called TCPDF, but it has a lot of limitations.
In the end, the solution was to use Qt. Here is the code, written in PyQt:
import sys
from PyQt4.QtCore import *
from PyQt4.QtGui import *
from PyQt4.QtWebKit import *

app = QApplication(sys.argv)

# Load the page to convert; the view does not need to be shown
web = QWebView()
web.load(QUrl("http://www.google.com"))
#web.show()

# Set up a printer that writes an A4 PDF file instead of printing
printer = QPrinter()
printer.setPageSize(QPrinter.A4)
printer.setOutputFormat(QPrinter.PdfFormat)
printer.setOutputFileName("file.pdf")

def convertIt():
    # Called once the page has finished loading
    web.print_(printer)
    print("Pdf generated")
    QApplication.exit()

# Run the conversion when the page has loaded
QObject.connect(web, SIGNAL("loadFinished(bool)"), convertIt)
sys.exit(app.exec_())
To convert an HTML string rather than a page at a URL, we can use setHtml() instead of load(), which takes the URL of the page you want to convert. However, on Windows the loadFinished() signal was not emitted when we used setHtml() (possibly a bug), whereas it works on Linux.
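As a rough sketch of the setHtml() variant, the conversion can be wrapped in a function. The name html_to_pdf and its parameters are made up for illustration and are not part of PyQt. The sketch connects loadFinished before calling setHtml() and also copes with the signal firing before the event loop starts. That ordering is an assumption about how setHtml() may behave, and I have not verified that it fixes the Windows problem described above.

```python
import sys


def html_to_pdf(html, output_path):
    """Render an HTML string to a PDF file (hypothetical helper, PyQt4 assumed)."""
    # Imported here so the module itself can be imported without PyQt4 installed
    from PyQt4.QtGui import QApplication, QPrinter
    from PyQt4.QtWebKit import QWebView

    # Reuse the running QApplication if the caller already created one
    app = QApplication.instance() or QApplication(sys.argv)

    web = QWebView()
    printer = QPrinter()
    printer.setPageSize(QPrinter.A4)
    printer.setOutputFormat(QPrinter.PdfFormat)
    printer.setOutputFileName(output_path)

    state = {"done": False}

    def convert_it(ok):
        # Runs when the view reports that loading has finished
        web.print_(printer)
        state["done"] = True
        app.quit()

    # Connect first, so the signal is not missed if it is emitted early
    web.loadFinished.connect(convert_it)
    web.setHtml(html)

    # Only enter the event loop if the conversion has not already happened
    if not state["done"]:
        app.exec_()
```

Calling html_to_pdf("<h1>Hello</h1>", "file.pdf") should write file.pdf and then return. Unlike the script above, the function does not call sys.exit(), so the calling program keeps running afterwards.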
Nice post. Thanks for sharing.
Hi,
I have some questions about the above code.
1. Is it possible to convert local HTML files (like c:\test.html)?
2. I tried to use this inside another Python program, but my Python script was terminated. Is it possible to use this code as a function?
Thanks
loganathan
Thanks for the code. Was it easy to do? I have tried a few programs to convert html to pdf but the one I have found to be the most user friendly is http://docraptor.com. I tried the free version and it was simple.
For the code to work properly, you may have to put the "def" part before the line:
app = QApplication(sys.argv)
Otherwise it gives the following error:
Traceback (most recent call last):
  File "./using_pyqt.py", line 21, in convertIt
    web.print_(printer)
RuntimeError: underlying C/C++ object has been deleted
Error in sys.excepthook:
Traceback (most recent call last):
  File "/usr/lib/python2.6/dist-packages/apport_python_hook.py", line 48, in apport_excepthook
    if not enabled():
  File "/usr/lib/python2.6/dist-packages/apport_python_hook.py", line 21, in enabled
    import re
ImportError: No module named re
Original exception was:
Traceback (most recent call last):
  File "./using_pyqt.py", line 21, in convertIt
    web.print_(printer)
RuntimeError: underlying C/C++ object has been deleted