This project has moved. For the latest updates, please go here.

conversion Excel to Pdf

Apr 12 at 2:42 PM
Hi Cognidox,

We use OfficeToPdf in production and it’s doing a great job for us. We do however struggle with the conversion of Excel files. The default settings for printing in Excel are often not fit to the worksheet and rarely set properly by the creator of the workbook. Classic case: columns spread over two pages and about a hundred rows, results in a pdf with the first columns of a row on the first page and what’s left of the row on the third page. And that is not a nice pdf to scroll through…

I’ve taken a look at the optional switches, but nothing seems to provide a way to adjust printer settings. Simply setting orientation to landscape, narrow margins and fit columns to page solves it for most of the cases. Do you think it’s possible to implement options to adjust printer settings?

Or maybe even a better option, I noticed for conversion with Word, there’s a work around: if I make a template with specific print settings and use the /merge /template <template_path> switches, the template settings are used for printing. But for Excel, there’s no such option. Could the possibility to use a template for Excel be implemented as well? It seems like being able to use a template with macro’s could improve the pdf result a lot.

Thanks!
Coordinator
Apr 13 at 1:50 PM
Hi

You're right that at present, there are no options that allow control over Excel layout.

The idea of re-using the /template switch does seem to be a sensible. We'll raise an issue for this and consider it for the next release.

Regards
Vittal
Coordinator
Apr 13 at 3:48 PM
Hi

Try using the build of officetopdf.exe attached to https://officetopdf.codeplex.com/workitem/53

Use the /template option to specify a spreadsheet with your desired print options. There's no need to use the /merge switch.

Regards
Vittal
Apr 14 at 12:20 PM
Hi,

Thanks a lot for the quick response! I'm in the middle of testing and the first results seem promising, I'll get back to you soon with more feedback.

Kind regards
Anja
Apr 20 at 8:52 AM
Edited Apr 20 at 9:16 AM
Hi,

I've tested it some more and we will be implementing this.
I am however currently using a template without macro's. It would be nice to be able to run macro's from the template without the risk of running macro's from the original file, because that would increase the possibilities of adapting to the content.
Currently, without using /excel_auto_macros, no macros are run. With the switch, the macro's from the original file are run, not from the template. When I look at the issue, I guess this is because the template options are copied into the original file and not the other way around.

But either way this is a big step forward for our excel conversion. Do you have an idea when this would make it in to a release build?

Kind regards
Anja
Coordinator
Apr 20 at 11:54 AM
Hi Anja

I've attached another build to the https://officetopdf.codeplex.com/workitem/53 that adds the /excel_template_macros switch. This will call Auto_Open macros on the template file before the page settings in the template workbook have been copied to the original file. This may allow you to make the changes you need.

We don't currently have a date for a release build with these changes, but they will be part of the next release, so you can continue to use these test builds and update safely when we make the next release.

Regards
Vittal
Apr 20 at 2:21 PM
Hi,

Thanks for the update, I'll look in to the possibilities, but it wasn't my original idea.
However, that would probably have been too much work/too specific to implement. As you explain it, the way of working with templates for Excel is different from Word. For the latter, the documentation says a new file is created from the template and the data is copied in to this file. I figured if the same logic is applied to Excel files, you can copy data into the new file and then after that, it might be possible to run the macro from the template. That way, it would be possible to take the actual contents of the workbook into account on adjusting the settings, while not risking to run macro's from the original file.
But For Excel it seems to work the other way around: you copy settings from the template to the original file.
I understand of course the contents of a Word file are very different from an Excel workbook and my idea is probably not as straight forward as it sounds.

Anyway, we'll run with this and when there's a new release I can update to, I'll be sure to give it a five star rating. Thanks for the help!
Anja
Coordinator
Apr 20 at 3:05 PM
Hi Anja

Yes, we don't copy the worksheets from the original document into the template. Rather, we take a sub-set of safe page settings from the template workbook's first worksheet and apply them to all the worksheets in the original document. I can see problems copying all the worksheet data from the original document into the template with references breaking.

One thing to note is that any macros in the template document might potentially be able to access content in the original document. The conversion process opens the original document as the first application workbook, then opens the template as the second workbook. So, your VBA code might be able to call Workbooks.Item(1) to get the original document.

Regards
Vittal
Apr 21 at 7:06 AM
Hi,

Good catch! I should indeed be able to access the contents that way. I'll try and see if that works.

Kind regards
Anja
Coordinator
Apr 21 at 11:48 AM
Anja

FYI, the macro code will run before we pull the page settings data from the first worksheet in the template.

This means that if you have many types of spreadsheets you need to handle, you can have one worksheet in the template containing the appropriate page settings for each spreadsheet type. If your macro code can determine what type of spreadsheet is being converted, it can move the appropriate worksheet to be the first worksheet in the template. The conversion code will therefore use the appropriate page settings for the document being converted.

Regards
Vittal
Apr 25 at 11:50 AM
Hi,
We've updated our system to use the last version, but it seems for some of our machines it's too much to handle and it's impossible to recover.

first time running, the output says:
I get "Retrieving the COM class factory for component with CLSID (..) failed due to the following error: 80080005 server execution failed (Exception from HRESULT: 0x80080005 (CO_E_SERVER_EXEC_FAILURE))
Did not convert

Next times running, the output gives a bunch of "Unable to set property ..." and ends with:
The RPC server is unavailable. (Exception from HRESULT: 0x800706BA)

and I get following error output:
Unhandled Exception: System.Runtime.InteropServices.COMException: The RPC server is unavailable. (Exception from HRESULT: 0x800706BA)
at Microsoft.Offices.Interop.Excel.Workbook.set_Saved(Boolean RHS)
at OfficeToPDF.ExcelConverter.Convert(String inputFile, String outputFile, Hashtable options)
at OfficeToPDF.Program.Main(String[] args)

No matter how simple the file, it keeps on failing. I've tried repairing Office, but to no avail.

Any idea what's the cause / how to fix?
Coordinator
Apr 25 at 4:54 PM
Hi Anja
  • Are all your machines of the same configuration (same version of Office, Windows, user permissions)?
  • Are they in use by one or more users while officetopdf is running?
  • Is officetopdf the only application calling Office on these machines?
  • Is this just for Excel documents, or other types too?
The Office COM system is really quite fragile as a whole - this is why Microsoft themselves say not to use automation to drive Office. If some machines are busy, or other applications/users are calling into Office, things can get problematic.

Regards
Vittal
Coordinator
Apr 26 at 9:04 AM
Hi Anja

I've attached another build to https://officetopdf.codeplex.com/workitem/53

This attempts to wait a bit longer for Excel to start up on slow/loaded machines before starting the conversion process. This might help with some of your machines.

Regards
Vittal
Apr 26 at 9:16 AM
Hi,
To quickly answer your questions:
  • yes, all machines have the same configuration (Windows 10, Office 2016, admin rights)
  • One user
  • Only officetopdf makes Office calls
  • I've seen a Word file generating a COMException, but for Excel (which now uses the template option) , it's every time.
The machines can get quite busy at times.
Coordinator
Apr 26 at 10:06 AM
Anja

Another question: Are you running multiple copies of officetopdf at the same time, or do you only have one copy running at a time? Running multiple copies at once is probably going to fail consistently for you.

Regards
Vittal
Apr 27 at 9:04 AM
Hi,

No, there's only one instance running at a time.
I've reverted the machine to use the old version (without the template) and that still works.
However, I do see a "Microsoft Excel has stopped working" screen pop up now at every run. So officetopdf seems to manage to do what has to be done, but something is off.
I can't say whether or not that was already the case on that machine before the update.
Coordinator
Apr 27 at 10:32 AM
Hi Anja

Does the latest build on https://officetopdf.codeplex.com/workitem/53 work without the /template option?

Is the template and/or source Excel file on the local machine disk or on a network drive?

How large a file is your template file?

Regards
Vittal
Coordinator
Apr 27 at 11:39 AM
Anja

Try using the build https://www.codeplex.com/Download/AttachmentDownload.ashx?ProjectName=officetopdf&WorkItemId=53&FileAttachmentId=1659741

When running, add the option /excel_delay 2000

This will introduce a two second delay after opening each workbook. This may give slow or heavily loaded machines the space needed for the COM references to appear. Try decreasing or increasing the delay time to see if it has an effect.

Regards
Vittal
Marked as answer by cognidox on 5/12/2017 at 6:34 AM
Apr 27 at 1:52 PM
Hi Vittal,
  • The latest build on ../53 works without the /template option (well, I do of course get the "Microsoft Excel has stopped working" screen at every run)
  • The source and template are local (officetopdf runs on C:, files are on D:)
  • The template is only 8 KB
  • I tested the new build, with the delays, but the result remains the same
  • It probably doesn't help, as I already gave the error output, but just in case: officetopdf always gives exit value -532462766
Kind regards
Anja
Coordinator
Apr 27 at 3:31 PM
Hi Anja

Are you using 32bit or 64bit office? You may see more information about the error in the Windows event log.

We're unable to replicate what you're seeing, so we're not sure we can get any further in diagnosing.

What you might want to consider is doing the PDF export directly from a macro in your template:
  1. Copy the template to a temporary folder with a known filename
  2. Copy the file to be converted to to the folder with another fixed filename
  3. Open the template with excel.exe
  4. Have an auto run macro in the template:
    a. open the file to be converted
    b. set the appropriate page layout options
    c. select the appropriate worksheets from the document to be converted
    d. use ExportAsFixedFormat to output the PDF
    e. close the workbooks and quit Excel
  5. Take the resulting PDF and clean up the temporary files
Although it requires some batch glue and macro work, it should avoid the problems caused by using the Office interop API, since it is all running within Excel.

Regards
Vittal
May 11 at 8:45 AM
Hi,
Thanks a lot for the help. We've been using this for a while now and our pdf output generally looks much better.
It also looks like the issues I ran in to are more of a problem with the office installation, rather than officetopdf.