How to verify pdf content

Certified Professional
Posts: 31
Joined: Wed Mar 25, 2015 4:44 pm
Location: Switzerland

Post by Speedboat » Thu Dec 15, 2016 4:00 pm


I have to automate a web application in which a pdf file is created and can be viewed / downloaded.
1) How can I verify that the pdf is structurally correct?
(if the pdf file is corrupt, how it is displayed depends on the browser and the plugin)
2) Can I get the number of pages in the pdf in a universal way (for all browsers and plugins)?
3) Is it possible to read some content of the pdf file (e.g. to check whether a given string exists)?

Currently (Ranorex 6.2.0), Ranorex sees only one large area, but no content.
With Firefox and the current plugin I am able to verify at least the number of pages from the toolbar.
But this test is brittle and would run only on my machine.

Another workaround would be to download the file and convert it to text with another tool, or simply to copy the text with Ctrl+C and verify the clipboard content.
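
The download-and-inspect workaround could be sketched roughly as below. This is a minimal sketch in Python using only the standard library, not a Ranorex feature; the function names `check_pdf` and `contains_text` are made up for illustration. It assumes the downloaded file is available as raw bytes, that page objects are not packed into compressed object streams, and that the text is stored as plain literal strings in Flate-compressed or uncompressed content streams. Many real pdf files violate the last two assumptions (subsetted fonts, custom encodings, kerned text split into pieces), so a page count of 0 or a missing string is not proof of a defect.

```python
import re
import zlib


def check_pdf(data: bytes) -> dict:
    """Run cheap structural checks on the raw bytes of a downloaded pdf."""
    # A pdf file starts with a "%PDF-x.y" header and ends with "%%EOF".
    has_header = data.lstrip()[:5] == b"%PDF-"
    has_eof = b"%%EOF" in data[-1024:]
    # Heuristic page count: page objects carry "/Type /Page", page tree
    # nodes carry "/Type /Pages"; the negative lookahead skips the latter.
    page_count = len(re.findall(rb"/Type\s*/Page(?![A-Za-z])", data))
    return {"has_header": has_header, "has_eof": has_eof,
            "page_count": page_count}


def contains_text(data: bytes, needle: str) -> bool:
    """Heuristically search all stream objects for a literal string."""
    target = needle.encode("latin-1")
    for match in re.finditer(rb"stream\r?\n(.*?)endstream", data, re.DOTALL):
        raw = match.group(1)
        try:
            # Most content streams are Flate (zlib) compressed.
            content = zlib.decompressobj().decompress(raw)
        except zlib.error:
            # Assume the stream was stored uncompressed.
            content = raw
        if target in content:
            return True
    return False
```

A test could read the downloaded file with `open(path, "rb").read()` and assert on the returned values. For anything beyond a smoke test, a dedicated pdf parsing library is the more reliable choice, because it resolves object streams and font encodings that this sketch ignores.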

If the pdf is not protected and text can be selected, what built-in options does Ranorex offer?

Thanks for any suggestions

Ranorex Guru
Posts: 7469
Joined: Mon Aug 13, 2012 9:54 am
Location: Zilina, Slovakia

Re: How to verify pdf content

Post by odklizec » Thu Dec 15, 2016 4:09 pm


It should be possible to read and validate the content of a PDF file, but only if the PDF file has the "accessibility" option enabled, as described here: ... tml#p25699
Also, as far as I remember, there was a problem with some PDF creators: element recognition may not work with PDF files created with anything other than Adobe Acrobat.
Pavel Kudrys
Ranorex explorer at Descartes Systems