GrapeCity Documents

Tagged PDF Documents

An accessible PDF is also referred to as a tagged PDF document. PDF tags have names similar to HTML tags, and the PDF specification defines a set of standard tags. Adding these tags has no visual effect on the document. Instead, they add a hidden structure that represents the document content in a form recognizable by a screen reader or other text-to-speech software.

GrapeCity Documents for PDF (GcPdf) allows users to create tagged PDF documents. Below we discuss how to add tags to different content elements such as text, paragraphs, lists, images, etc.

How to Create a Tagged PDF Document

A PDF document is composed of different content elements such as text, paragraphs, lists, images, and tables. Each of these elements can be represented by a standard PDF tag such as ‘P’ for paragraph, ‘L’ for list, and ‘Figure’ for an image.

When creating a tagged PDF document, add a tag for each content element. The collection of these tags is represented as a tree with child nodes. This tree is presented to the screen reader, which uses it to read the PDF content aloud for people with disabilities.

Following the same approach as described above, GcPdf tags a PDF document using a set of structural elements. Each structural element represents a content element in the document.

For example, to represent a paragraph, create a structural element of the ‘P’ tag type. Structural elements are represented by the StructElement class provided by the GcPdf library.
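As a minimal sketch of this idea, the snippet below builds a small tree of structural elements without rendering any content. It assumes the GcPdf NuGet package is referenced and that StructElement lives in the GrapeCity.Documents.Pdf.Structure namespace. The ‘Part’ grouping tag and the class name StructTreeSketch are choices made for illustration and are not taken from the text above.

```csharp
using GrapeCity.Documents.Pdf;
using GrapeCity.Documents.Pdf.Structure; // assumed namespace of StructElement

public static class StructTreeSketch
{
    public static void Main()
    {
        var doc = new GcPdfDocument();

        // A grouping element at the top of the tree ('Part' is an assumption;
        // it is one of the standard PDF grouping tags).
        var part = new StructElement("Part");
        doc.StructTreeRoot.Children.Add(part);

        // One structural element per content element, named by its tag.
        var paragraph = new StructElement("P");   // a paragraph
        var figure = new StructElement("Figure"); // an image

        // Child nodes of the grouping element form the tree.
        part.Children.Add(paragraph);
        part.Children.Add(figure);
    }
}
```

At this point the elements are only empty nodes. They become meaningful to a screen reader once rendered content is linked to them.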

Follow the steps below to create a tagged PDF document with GcPdf:

  1. Create a new PDF document by initializing the GcPdfDocument class and add a new page to it by accessing the Pages property of the GcPdfDocument class.
  2. Fetch the page graphics by accessing the Graphics property of the Page class. The returned GcPdfGraphics class instance will be used to render the document content and create the logical structure tree by adding the appropriate tags.
  3. Create a container element by initializing an instance of the StructElement class and add it to the tree root by accessing the StructTreeRoot property of the GcPdfDocument class. Container elements sit at the highest level of the hierarchy and group the other block-level elements.
  4. Create a block-level element by initializing an instance of the StructElement class, based on the type of content that you are adding to the document (such as a paragraph, list, table, or image).
  5. Add the block-level element as a child of the container element (created in step 3) by accessing the Children property of the StructElement class.
  6. Generate the content for the block-level element by invoking the appropriate method, such as DrawImage for rendering an image or DrawTextLayout for rendering text and paragraphs. The content must be rendered as marked content: call the BeginMarkedContent method of the GcPdfGraphics class before drawing and the EndMarkedContent method of the same class after drawing. The BeginMarkedContent method accepts a parameter of type TagMcid, which identifies the marked content.
  7. Append the generated marked content to the related structure element by accessing the ContentItems property of the StructElement class.
  8. Set the Marked property of the MarkInfo class to true by accessing the MarkInfo property of the GcPdfDocument class. This indicates that the document conforms to the Tagged PDF conventions.
  9. Save the PDF document by invoking the Save method of the GcPdfDocument class.
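
The steps above can be sketched in C# roughly as follows, for a document with a single tagged paragraph. This is a hedged outline rather than a definitive implementation. The namespaces, CreateTextLayout, StandardFonts.Helvetica, the DefaultPage property, and the McidContentItemLink class are recalled from the GcPdf samples and should be checked against the version of the library in use. The ‘Part’ tag, the MCID value 0, the drawing position, and the output file name tagged.pdf are illustrative choices.

```csharp
using System.Drawing;
using GrapeCity.Documents.Pdf;
using GrapeCity.Documents.Pdf.MarkedContent; // assumed namespace of TagMcid
using GrapeCity.Documents.Pdf.Structure;     // assumed namespace of StructElement

public static class TaggedPdfSketch
{
    public static void Main()
    {
        // Step 1: create the document and add a page.
        var doc = new GcPdfDocument();
        var page = doc.Pages.Add();

        // Step 2: get the page graphics used for rendering and tagging.
        var g = page.Graphics;

        // Step 3: create a container element and add it to the tree root.
        var part = new StructElement("Part");
        doc.StructTreeRoot.Children.Add(part);

        // Steps 4 and 5: create a paragraph element and add it to the container.
        // DefaultPage (assumed property) ties the element to the page it is drawn on.
        var paragraph = new StructElement("P") { DefaultPage = page };
        part.Children.Add(paragraph);

        // Step 6: render the paragraph text as marked content.
        var layout = g.CreateTextLayout();
        layout.DefaultFormat.Font = StandardFonts.Helvetica;
        layout.DefaultFormat.FontSize = 12;
        layout.Append("This paragraph is tagged for screen readers.");
        layout.PerformLayout(true);

        const int mcid = 0; // illustrative marked-content ID for this page
        g.BeginMarkedContent(new TagMcid("P", mcid));
        g.DrawTextLayout(layout, new PointF(72, 72));
        g.EndMarkedContent();

        // Step 7: link the marked content to the paragraph element.
        paragraph.ContentItems.Add(new McidContentItemLink(mcid));

        // Step 8: declare that the document follows Tagged PDF conventions.
        doc.MarkInfo.Marked = true;

        // Step 9: save the document.
        doc.Save("tagged.pdf");
    }
}
```

The same MCID value is passed to both TagMcid and the content item link, since that shared value is what connects the drawn content to its node in the structure tree. To inspect the result, open the saved file in a viewer that can display the tag tree.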


Document Accessibility for PDF Documents in C# .NET