FAQs: PDF to HTML Conversion for Digital Commons Websites
Frequently Asked Questions
A Digital Commons Web Manager can use the service to convert a PDF into an HTML web page.
- The service uses AI to analyze each PDF, extract its text, images, and structure, and convert them into HTML code.
- The AI will attempt to generate alt text for images, mark up decorative images, eliminate pagination, and eliminate the repeated headers and footers in the PDF document.
- The service will generate an "auditor" report on the accuracy of the conversion for human review.
- The generated HTML should be compliant with WCAG 2.1 Level AA, but some human review may be needed.
- The Web Manager can choose to finalize the conversion by publishing it as HTML.
- The converted HTML web page is created as a site page content type, and the Digital Commons site styles are applied to it automatically.
- Once converted, the Web Manager can make edits to the page.
- All links to the PDF will be replaced by links to the converted HTML page.
- A Web Manager can choose to revert the conversion back to a PDF at any time. However, any changes made to the HTML page after conversion will be lost after reverting.
At this time, some PDFs cannot be converted by the service.
List of exclusions (cannot be converted using this service):
- PDF forms / fillable forms
- Password‑protected PDFs
- Presentation slides saved as PDF
- A single image saved as a PDF
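For site managers who want to pre-screen a PDF inventory before opening the conversion interface, the exclusion list above can be expressed as a simple filter. This is only an illustrative sketch: the `PdfRecord` fields and the `exclusion_reasons` function are hypothetical and are not part of the service, which applies its own checks.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class PdfRecord:
    """Hypothetical inventory entry; these fields are assumptions, not service data."""
    name: str
    is_fillable_form: bool = False
    is_password_protected: bool = False
    is_slide_deck: bool = False
    is_single_image: bool = False


def exclusion_reasons(pdf: PdfRecord) -> list[str]:
    """Return the listed exclusions that apply to a PDF (empty list if none apply)."""
    checks = [
        (pdf.is_fillable_form, "PDF form / fillable form"),
        (pdf.is_password_protected, "Password-protected PDF"),
        (pdf.is_slide_deck, "Presentation slides saved as PDF"),
        (pdf.is_single_image, "Single image saved as a PDF"),
    ]
    return [reason for flag, reason in checks if flag]
```

An empty result only means that none of the four listed exclusions apply; the service may still decline a file for other reasons.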
Any PDF published after April 24, 2026, must be compliant with WCAG 2.1 Level AA according to Title II ADA.
A PDF published prior to April 24, 2026, needs to be compliant with WCAG 2.1 Level AA unless it qualifies for an exception. Discuss with your agency's General Counsel to determine if a particular PDF qualifies for an exception to Title II ADA.
No. No automated tool can guarantee 100% compliance with WCAG 2.1 Level AA. This service significantly reduces the amount of work required to manually review web content. Some final human evaluation may still be needed.
No. The conversion focuses on content accuracy and accessibility, not visual replication. Expect fonts, styling, and layout to change. The converted document will use the styles of the Digital Commons theme. The service will generate an "auditor" report to help identify areas where human review is needed.
Mostly. The AI uses OCR (Optical Character Recognition) for scanned content, but quality depends on the clarity of the original scan. Manual corrections may be needed.
This service only supports PDF files. For accessibility support on additional file types, please follow our accessibility guidance for other document file types.
No. All existing links to the PDF will automatically redirect visitors to the converted HTML site page. The Web Manager does not need to take extra steps to set up the redirects.
Yes. A converted HTML page can be reverted back to the original PDF at any time.
However, any edits made to the HTML page will be lost after reverting.
No. Once converted, the new HTML site page replaces the PDF.
Yes. However, the Digital Commons team will not provide the conversion interface. Non-DC site managers should expect to remove the old PDFs and replace them with the converted HTML pages themselves. More details to come.
- For Digital Commons websites, this service should be available starting March 30, 2026 (subject to change).
- For non-DC websites, the dates are still being worked out.
Free for Digital Commons websites.
For agency websites not hosted on Digital Commons, the PDF-to-HTML solution should cost as little as $0.03 per page. Details are still being finalized.
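As a rough planning aid, the per-page figure above can be turned into a lower-bound estimate. The sketch below assumes the $0.03 rate applies uniformly to every page, which has not been confirmed because pricing is still being finalized. The function name `estimate_minimum_cost` and the page counts are hypothetical.

```python
from decimal import Decimal

# Assumption: the "as little as $0.03 per page" figure, treated as a floor.
MIN_RATE_PER_PAGE = Decimal("0.03")


def estimate_minimum_cost(page_counts):
    """Return a lower-bound cost in dollars for PDFs with the given page counts."""
    counts = list(page_counts)
    if any(count < 0 for count in counts):
        raise ValueError("page counts must be non-negative")
    return MIN_RATE_PER_PAGE * sum(counts)


# Hypothetical library: three PDFs of 12, 40, and 250 pages (302 pages total).
print(estimate_minimum_cost([12, 40, 250]))  # 9.06
```

Because the quoted rate is a minimum, the actual cost may be higher than this estimate.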
The recommended review approach is outlined in the Digital Commons guide. Note: More details and updated screenshots/steps will be available later in April.
No. PDFs are sometimes a necessary file type for sharing content on the web. However, we encourage using HTML as much as possible.
The site Web Manager decides which files are converted. No files are converted automatically. The Web Manager uses an interface to select and preview the PDFs for conversion.
Yes, just like any site page.
A goal of the tool is to preserve the content of the original document. To promote fidelity, we've chosen to instruct the AI to preserve errors and typos as they appear in the source, which limits its tendency to guess or hallucinate.
Please reach out to the Digital Commons team if this is something you are interested in for your site.
If you are asking this question, you are probably thinking about making sure your documents are compliant with Title II ADA. Thank you!
Use this guide to help you think through which documents to prioritize: Strategy to Remediate a Large PDF Library
The AI used by the service can take an untagged, inaccessible PDF and convert it into a structured, accessible HTML page. You shouldn't need to do any document remediation work before the conversion. However, you may need to verify some aspects of the accessibility of the converted HTML page, depending on the content and complexity of the original document.
Rows that are disabled / grayed-out in the conversion management page represent PDFs that cannot be converted to HTML because they meet the exclusion criteria or because the document did not meet our quality standards based on automated auditing analysis. Further improvements to this tool will increase the number of files available for conversion.
The PDF to HTML conversion tool was created by NCDIT Digital Solutions, the NCDIT Office of AI & Policy, and the NCDIT Accessibility Support Team in partnership with the Google Rapid Innovation Team and Nerdery.
This page was last modified on 03/31/2026