File Includer

plugins/cap-file-includer/cap-file-includer.php

Plugin Name: Capitularia File Includer
Plugin URI:
Description: Includes external HTML files in Wordpress pages.
Version: 0.1.0
Author: Marcello Perathoner
Author URI:
License: GPLv2 or later
Text Domain: cap-file-includer
Domain Path: /languages

Capitularia File Includer plugin.

The File Includer plugin registers a Wordpress shortcode that includes any external HTML file in a Wordpress page. We use this shortcode to put the transcribed manuscripts into Wordpress.

The TEI files are transformed into HTML files on the Capitularia VM, where we maintain up-to-date Python and Java installations. A customary Web Projekt at uni-koeln.de either lacks them or ships outdated versions of them.

This plugin also stores the included text into the Wordpress database. This makes the built-in Wordpress search function work with the included material.

Note

Currently (Nov. 2019) the plugin also does some post-processing of the HTML files. This code will also be rewritten and moved to the VM.

The format of the shortcode is:

[cap_include path="path/to/file.html" post="true"]

param str path:

Path of the file to include, relative to the root on the settings page.

param str post:

Optional. If the included file should be post-processed by the footnotes-post-processor then set this parameter to true.
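
As a rough illustration of how the path attribute relates to the configured root, the following sketch joins the two and refuses paths that try to climb out of the root. The function name resolve_include_path and the ".." check are assumptions made for this example; the plugin's actual lookup may differ.

```php
<?php
/**
 * Hypothetical helper: join the configured root and the shortcode path.
 * Returns null if the path contains a ".." segment (an assumed safety check).
 */
function resolve_include_path(string $root, string $path): ?string
{
    // Split on forward and backward slashes, dropping empty segments.
    $segments = preg_split('#[/\\\\]+#', $path, -1, PREG_SPLIT_NO_EMPTY);
    if (in_array('..', $segments, true)) {
        return null;
    }
    // The root is stored without a trailing slash, so add exactly one.
    return rtrim($root, '/') . '/' . implode('/', $segments);
}
```

With a made-up root of /var/www/html, resolve_include_path('/var/www/html', 'mss/file.html') yields /var/www/html/mss/file.html, while a path such as ../secret.html yields null.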

See also: the Page Generator plugin, which generates batches of page stubs from directories of TEI files. Those stubs usually contain the shortcodes for this plugin.

constant NAME

'Capitularia File Includer'

The name of the plugin.

constant DOMAIN

'cap-file-includer'

The Text Domain of the plugin.

constant OPTIONS

'cap_fi_options'

The Wordpress ID of the settings (option) page.

plugins/cap-file-includer/class-file-includer.php

Capitularia File Includer Main Class

class FileIncluderEngine

Implements the inclusion engine.

One difficulty is to get in early enough so that the qtranslate-x plugin has not translated away the unwanted languages. We need all languages to be there when we have to save the page. qtranslate-x hooks into ‘the_posts’ so we must too.

The other difficulty is to protect the included content from the wpautop and wptexturize filters, which were implemented with boundless incompetence: they try to put <p> tags around the included content everywhere and mangle the HTML attributes with curly quotes.

To get around those filters we insert <pre> tags around the included content. This is the only way to fend those filters off for a portion of a page without disabling them wholesale. We double the <pre> tags, like this: <pre><pre>…</pre></pre>, so that we can filter them out again later without danger of removing tags of other provenance.
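
The doubled <pre> trick can be sketched as a wrap and unwrap pair. The function names wrap_in_pre and unwrap_pre are hypothetical stand-ins for this example, and the regular expressions are an assumption about how the stripping could be done, not a copy of the plugin's code.

```php
<?php
// Hypothetical stand-ins for the wrapping and stripping steps.
function wrap_in_pre(string $content): string
{
    // Step 1: shield the included HTML from the paragraph and quote filters.
    return '<pre><pre>' . $content . '</pre></pre>';
}

function unwrap_pre(string $content): string
{
    // Step 2: remove only the doubled tags at the very start and end,
    // so a single <pre> inside the included material survives.
    $content = preg_replace('#^\s*<pre><pre>#', '', $content);
    return preg_replace('#</pre></pre>\s*$#', '', $content);
}
```

Because only the doubled tags at the edges are removed, included material that itself ends in a </pre> tag comes back unchanged.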

We have to save the included content to the database to make it searchable by the built-in Wordpress search engine.

property do_save

(bool) Do we have to save the post?

property post

(WP_Post) A ref to the post being processed.

on_shortcode_early(atts, content)

Process our shortcodes. Step 1: Include the file.

Called very early from on_the_posts().

See: cceh\capitularia\file_includer\on_the_content_early()

Parameters:
  • atts (array) – The shortcode attributes.

  • content (string) – The shortcode content.

Returns:

The content to insert into the shortcode.

Return type:

string

on_shortcode(dummy_atts, content)

Process our shortcodes. Step 2.

Called after wpautop and wptexturize have done their nefarious work. Cleans up the <pre> tags we inserted only to protect against them.

Parameters:
  • dummy_atts (array) – (unused) The shortcode attributes.

  • content (string) – The shortcode content.

Returns:

The content with <pre> tags stripped.

Return type:

string

on_the_posts(posts, query)

Process our shortcodes. Step 1.

We are forced to hook into ‘the_posts’ because the qtranslate-x plugin does it this way and we must get in before qtranslate-x has translated away the unwanted languages.

We cannot save inside the on_shortcode_early hook because there may be more than one shortcode on the page and besides there may be other content too.

Parameters:
  • posts (WP_Post[]) – The array of posts.

  • query (WP_Query) – The query.

Returns:

The array of posts with our shortcode processed.

Return type:

WP_Post[]

plugins/cap-file-includer/class-settings-page.php

Capitularia File Includer Settings Page

class Settings_Page

Implements the settings (options) page.

Found in Wordpress admin under Settings | Capitularia File Includer.

__construct()

Constructor

Add option fields so we can use the Wordpress function do_settings_sections() to output them.

Also register one POST parameter to be handled and validated by Wordpress. We want all user entries to be returned into PHP as one string array called OPTIONS_PAGE_ID. This array will be passed by Wordpress to the validation function and stored in the database all in one row.

See: http://planetozh.com/blog/2009/05/handling-plugins-options-in-wordpress-28-with-register_setting/ Blog post: how to store all plugin options into one database row.

Return type:

self

display()

Output the Settings page.

Return type:

void

Raises cceh\capitularia\file_includer\InvalidArgumentException:

if the provided argument is not of type ‘array’.

on_options_section_general()

Output the ‘general’ section.

Return type:

void

on_options_field_root()

Output the root option field with its description.

Return type:

void

on_options_field_shortcode()

Output the shortcode option field with its description.

Return type:

void

sanitize_path(path)

Sanitize a field that should contain a path.

Parameters:
  • path (string) – The path to sanitize

Returns:

Sanitized path without trailing slash.

Return type:

string
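
A minimal sketch of what such a sanitizer might do, assuming it only normalizes whitespace and slashes. The name sanitize_path_sketch and every cleaning step other than removing the trailing slash are assumptions for illustration.

```php
<?php
// Hypothetical sketch; the real method may clean the input differently.
function sanitize_path_sketch(string $path): string
{
    // Trim surrounding whitespace and normalize backslashes to slashes.
    $path = str_replace('\\', '/', trim($path));
    // Collapse runs of slashes into one.
    $path = preg_replace('#/+#', '/', $path);
    // Drop the trailing slash, but keep a lone "/" intact.
    return $path === '/' ? $path : rtrim($path, '/');
}
```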

on_validate_options(options)

Validate options entered by user

We get all user entries back in one associative array so that we can store them in one database row. This makes validation somewhat more difficult.

See: cceh\capitularia\file_includer\__construct()

Parameters:
  • options (array) – Array of key, value: the options as entered on the form.

Returns:

Array containing the validated options

Return type:

array

plugins/cap-file-includer/footnotes-post-processor-include.php

Capitularia Footnotes Post-Processor Include File

This script processes the output of the xslt transformation. Here we do those things that are easier in PHP than in XSLT:

  • Merge adjacent footnotes and move footnotes to the end of the word.

  • Drop footnotes followed by an editorial note in the same word.

  • Insert footnote refs and backrefs and number them sequentially.

  • Wrap initials (dropcaps) and the following word into a span.

  • Substitute editors’ shortcuts with proper mediaeval punctuation.

  • Accept XML or HTML input, always output HTML.

This file only declares symbols (classes, functions, constants) in accordance with PSR-2.

constant FOOTNOTE_SPAN

'//span[@data-note-id][not (ancestor::div[@class="footnotes-wrapper"])]'

constant FOOTNOTE_REF

'a[contains (concat (" ", @class, " "), " annotation-ref ")]'
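
The concat idiom in FOOTNOTE_REF pads the class attribute with spaces so that only a whole class token matches. The sketch below runs the same predicate against a made-up HTML fragment; the leading // and the sample class names are assumptions added for the example.

```php
<?php
// Made-up fragment: links 1 and 3 carry the class token, link 2 only
// carries a longer class name that merely starts with the same letters.
$doc = new DOMDocument();
$doc->loadHTML(
    '<div><a class="annotation-ref first">1</a>'
    . '<a class="annotation-reference">2</a>'
    . '<a class="x annotation-ref">3</a></div>'
);

$xpath = new DOMXPath($doc);
$query = '//a[contains (concat (" ", @class, " "), " annotation-ref ")]';

$hits = [];
foreach ($xpath->query($query) as $a) {
    $hits[] = $a->textContent;
}
// $hits now holds the text of links 1 and 3 only.
```

A plain contains (@class, "annotation-ref") would also match the second link, which is why the padded form is used.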

is_note(node)

Is the node a note?

Parameters:
  • node (DOMNode) – The node to test.

Returns:

true if the node is a note.

Return type:

bool

add_class(node, class)

Add a class to a node.

Handles nodes that already carry one or more classes.

Parameters:
  • node (DOMElement) – The node.

  • class (string) – The class to add.

Return type:

void

has_class(node, class)

Test if node has class.

Parameters:
  • node (DOMElement) – The node.

  • class (string) – The class to test.

Returns:

True if the node has the class.

Return type:

bool
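
The two class helpers can be sketched on top of the standard DOM extension as follows. The _sketch function names and the sample class names are hypothetical, and the behavior on duplicates is an assumption.

```php
<?php
// Hypothetical re-implementations for illustration only.
function has_class_sketch(DOMElement $node, string $class): bool
{
    // Split the attribute on whitespace and compare whole tokens.
    $classes = preg_split('/\s+/', $node->getAttribute('class'), -1, PREG_SPLIT_NO_EMPTY);
    return in_array($class, $classes, true);
}

function add_class_sketch(DOMElement $node, string $class): void
{
    // Append the class unless it is already present.
    if (!has_class_sketch($node, $class)) {
        $node->setAttribute('class', trim($node->getAttribute('class') . ' ' . $class));
    }
}

$doc = new DOMDocument();
$span = $doc->createElement('span');
add_class_sketch($span, 'initial');
add_class_sketch($span, 'dropcap');
add_class_sketch($span, 'initial');   // already present, nothing is added
```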

is_text_node(node)

Test if node is a text node.

Parameters:
  • node (DOMElement) – The node.

Returns:

True if the node is a text node.

Return type:

bool

remove_node(node)

Remove node from parent.

Parameters:
  • node (DOMElement) – The node to remove.

Return type:

void

merge_notes(note, next)

Merge $note into $next.

Parameters:
  • note (DOMNode) – The note to merge.

  • next (DOMNode) – The note to merge into.

Return type:

void

wrap(nodes)

Wrap $nodes into a span.

Parameters:
  • nodes (array) – Nodes to wrap.

Return type:

void

word_end_pos(text_node)

Return the position of the character after the first word in $text_node.

$text_node must be a text node.

Parameters:
  • text_node (DOMNode) – The text node.

Returns:

Position of first whitespace or false.

Return type:

mixed
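
The mixed return type means that callers have to tell position 0 apart from false. A sketch on a plain string rather than a DOM text node, with the function name and the regular expression as assumptions:

```php
<?php
/**
 * Hypothetical sketch working on a string instead of a text node.
 *
 * @return int|false Byte offset of the first whitespace, or false if none.
 */
function word_end_pos_sketch(string $text)
{
    if (preg_match('/\s/u', $text, $m, PREG_OFFSET_CAPTURE)) {
        return $m[0][1];   // offsets from preg_match are counted in bytes
    }
    return false;
}

$pos = word_end_pos_sketch('Karolus rex');
if ($pos !== false) {   // strict comparison, since 0 is a valid position
    $first_word = substr('Karolus rex', 0, $pos);
}
```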

query_copy(xpath_query_result)

Copies the result of an XPath query into an array.

Parameters:
  • xpath_query_result (DOMNodeList) – The XPath result.

Returns:

An array of nodes.

Return type:

array
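
One plausible reason for such a copy is that an array stays stable while the document is being modified, whereas some node lists (for example those returned by getElementsByTagName) are live. The sketch below assumes that motivation; the function name is hypothetical.

```php
<?php
// Hypothetical sketch: snapshot a node list into a plain array.
function query_copy_sketch(DOMNodeList $list): array
{
    $nodes = [];
    foreach ($list as $node) {
        $nodes[] = $node;
    }
    return $nodes;
}

$doc = new DOMDocument();
$doc->loadXML('<div><span/><span/><span/></div>');

// Take the snapshot first, then remove every node found.
$spans = query_copy_sketch($doc->getElementsByTagName('span'));
foreach ($spans as $span) {
    $span->parentNode->removeChild($span);   // we iterate over the copy
}
```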

insert_footnote_ref(elem, id)

Insert a footnote reference into the document.

Parameters:
  • elem (DOMElement) – The element after which insertion should take place.

  • id (string) – The id of the footnote.

Return type:

void

insert_footnote_backref(elem, id)

Insert a footnote back reference into the document.

Parameters:
  • elem (DOMElement) – The element after which insertion should take place.

  • id (string) – The id of the footnote.

Return type:

void

post_process(doc)

Post process the footnotes, etc.

Parameters:
  • doc (DOMDocument) – The document to process.

Returns:

The processed document.

Return type:

DOMDocument

load_xml_or_html(in)

Load XML or HTML.

We have (had) a mix of transformation scripts outputting either XML or HTML so we must read both formats.

Parameters:
  • in (string) – The XML or HTML as string.

Returns:

The new document.

Return type:

DOMDocument
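
One way to read both formats is to try the strict XML parser first and fall back to the lenient HTML parser. This is a sketch under that assumption, not necessarily how the plugin decides; the function name is hypothetical and character-encoding handling is left out.

```php
<?php
// Hypothetical sketch: XML first, HTML as the fallback.
// Note: an empty input string is not handled here.
function load_xml_or_html_sketch(string $in): DOMDocument
{
    // Collect parser errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    if (!$doc->loadXML($in)) {
        // Not well-formed XML: retry with the HTML parser.
        $doc = new DOMDocument();
        $doc->loadHTML($in);
    }

    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $doc;
}
```

Well-formed input keeps its own root element, while input parsed as HTML ends up wrapped in <html> and <body> elements.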

save_html(doc)

Convert the document to HTML.

We need the document as HTML because it gets embedded into a Wordpress page. We also need to get rid of the <!DOCTYPE> declaration and the <html>, <head>, and <body> elements. We do this by starting output at the topmost <div>.

Parameters:
  • doc (DOMDocument) – The document as DOM.

Returns:

The document as embeddable HTML.

Return type:

string
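
Starting the output at a given element can be done by passing that node to DOMDocument::saveHTML(). The sketch below assumes that the topmost <div> is the first one in document order; the function name is hypothetical.

```php
<?php
// Hypothetical sketch: serialize only the subtree of the first <div>.
function save_html_sketch(DOMDocument $doc): string
{
    $div = $doc->getElementsByTagName('div')->item(0);   // first <div> in document order
    // With a node argument, saveHTML() leaves out everything outside that node.
    return $div === null ? '' : $doc->saveHTML($div);
}

$doc = new DOMDocument();
$doc->loadHTML('<html><head><title>t</title></head><body><div><p>text</p></div></body></html>');
$html = save_html_sketch($doc);
```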

plugins/cap-file-includer/functions.php

Capitularia File Includer functions.

The main difficulty here is to get around the wpautop and wptexturize filters, which were implemented with boundless incompetence.

make_shortcode_around(atts, content)

Put shortcodes and <pre> tags around the content.

Parameters:
  • atts (array) – The shortcode attributes.

  • content (string) – The shortcode content.

Returns:

The content surrounded by shortcodes and <pre> tags.

Return type:

string

strip_pre(content)

Strip <pre> tags from around the content.

Parameters:
  • content (string) – The content to strip.

Returns:

The stripped content.

Return type:

string

ns(function_name)

Prefix a class or function name with the current namespace.

Parameters:
  • function_name (string) – The class or function name without namespace

Returns:

Name with namespace

Return type:

string

get_opt(name, default)

Get an option from Wordpress.

Parameters:
  • name (string) – The name of the option.

  • default (string) – The default value.

Returns:

The option value

Return type:

string

get_root()

Get the configured root directory.

Returns:

The root directory

Return type:

string

on_init()

Initialize the plugin.

Return type:

void

on_admin_menu()

Add menu entry to the Wordpress admin menu.

Add a menu entry for the settings (options) page to the Wordpress settings menu.

Return type:

void

Add a link to our settings page to the plugins admin dashboard.

Adds hack value.

Parameters:
  • links (array) – The old links

Returns:

The augmented links

Return type:

array

plugins/cap-file-includer/post-process-cli.php

Capitularia File Includer Main Class