Status: | obsolete |
Progress: | 100% |
Version: | 2.2.0 - 2.5.8 |
Archive:Word format
Word format | |
---|---|
Outputs the result in Microsoft Office Word file format (doc/docx). | |
Further Information | |
Provided by: | Extension "Semantic Result Formats" |
Added: | |
Removed: | |
Requirements: | MW 1.21+ (and "PhpOffice/PhpWord" library) or MW 1.22+ |
Format name: | word |
Enabled? (Whether the result format is enabled by default upon installation of the respective extension.) | yes |
Authors: | Wolfgang Fahl |
Categories: | export |
The result format word is used to format query results as a Word file.
This format becomes available automatically once the required PHPWord library is installed (SRF ≥ 1.9.1).
Parameters
- templatefile - the name of a docx Word file containing ${needle} placeholders. The file is looked up automatically in the File: namespace.
Example
We'd like to get a table of cities that have a population of more than 1 million people, sorted by population:
{{#ask: [[Category:City]] [[Population::>1000000]] |?Population |sort=Population }}
- A similar query using the Word format:
{{#ask: [[Category:City]] |?Population |searchlabel=Download result as Word file |templatefile=GermanCities.docx |format=word }}
Preparing a Template File
The template file needs to have ${needle} placeholders where the field results are to be inserted, e.g. ${population} would hold the population result.
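To see which placeholders a template actually contains, the text of word/document.xml can be scanned for ${...} patterns. The following is a minimal bash sketch and not part of SRF: the function name list_needles is hypothetical, and it assumes a grep that supports the -o option (GNU and BSD grep do).

```shell
#!/bin/bash
# list_needles (hypothetical helper): read document XML on stdin and
# print each distinct ${...} placeholder found in its visible text.
# Step 1 (sed) drops the XML tags so that only text content remains.
# Step 2 (grep) prints every ${...} match on its own line.
# Step 3 (sort) removes duplicates.
list_needles() {
  sed 's/<[^>]*>//g' | grep -oE '\$\{[^}]*\}' | sort -u
}
```

For the template from the example above, this could be called as unzip -p GermanCities.docx word/document.xml | list_needles, which reads the XML straight from the archive without unpacking it to disk.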
Caveats
Unfortunately, when saving Microsoft Word files, extra characters might get inserted into the placeholders; see this issue on stackoverflow.com.
To avoid this, you might want to:
- switch off correction mode (which might add red markup)
- use cut & paste in a formatless mode
You might want to check that the needles ${…} were not spoiled in the resulting docx XML. You can check this by unzipping the docx file and looking into the word/document.xml file.
A tool like xmlstarlet can help with this.
Here are a few lines of bash script as an example:
unzip -o GermanCities.docx
for keyword in population
do
  xmlstarlet fo word/document.xml | grep "$keyword"
done
The result should look like:
<w:t>${population}</w:t> …
As a script named "caveat", this looks like:
#!/bin/bash
# Copyright (C) 2015 BITPlan GmbH
# wf 2015-09-29
# check that a word template is ok for being used with the
# SMW word result format
# see http://semantic-mediawiki.org/wiki/Help:Word_format
#
# show usage
#
usage() {
  echo "usage: $0 wordtemplatefile keywords" 1>&2
  exit 1
}
# check command line parameters - there must be at least two
if [ $# -lt 2 ]
then
  usage
fi
file="$1"
# keywords is a single, space-separated argument, e.g. "population name"
keywords="$2"
if [ ! -f "$file" ]
then
  echo "$file does not exist" 1>&2
  exit 1
else
  unzip -o "$file" > /dev/null
  # word splitting of $keywords is intentional here
  for keyword in $keywords
  do
    xmlstarlet fo word/document.xml | grep "$keyword"
  done
fi
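A grep for the bare keyword still matches when Word has split a placeholder across several runs, so a stricter check can be useful. The following bash sketch is one possible approach and not part of SRF: the function name find_split_needles is hypothetical. It compares the placeholders visible in the tag-stripped text with the raw XML and reports those that no longer occur as one contiguous string.

```shell
#!/bin/bash
# find_split_needles (hypothetical helper): read document XML on stdin and
# print every ${...} placeholder that is visible in the text but does not
# occur as one contiguous string in the raw XML, i.e. markup interrupts it.
find_split_needles() {
  local xml needle
  xml=$(cat)
  # strip tags, then list the placeholders as a reader would see them
  printf '%s\n' "$xml" | sed 's/<[^>]*>//g' | grep -oE '\$\{[^}]*\}' | sort -u |
    while IFS= read -r needle
    do
      case "$xml" in
        *"$needle"*) ;;                  # intact in the raw XML
        *) printf '%s\n' "$needle" ;;    # interrupted by markup
      esac
    done
}
```

A possible call is unzip -p GermanCities.docx word/document.xml | find_split_needles; empty output would suggest that all placeholders found in the text are intact.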
Installation
This describes how to install the required PHPWord library with Composer, which is the recommended method for MW 1.22+. Either enter the following on your command line:
composer require phpoffice/phpword dev-master
or add the following as the last line of the "require" section in your "composer.json" file:
"phpoffice/phpword": "dev-master"
Note: Replace "dev-master" in this example with the version you want to install.
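After running Composer, a quick check can confirm that the library ended up where it is expected. The bash sketch below is only an illustration: the function name phpword_installed is hypothetical, and it assumes Composer's default vendor directory (vendor/ below the directory containing composer.json).

```shell
#!/bin/bash
# phpword_installed (hypothetical helper): succeed if Composer's default
# vendor directory below the given root contains phpoffice/phpword.
# The root defaults to the current directory.
phpword_installed() {
  local root="${1:-.}"
  [ -d "$root/vendor/phpoffice/phpword" ]
}
```

Run from the MediaWiki root directory, phpword_installed && echo "PHPWord found" would print the message only if the package directory exists. Alternatively, composer show phpoffice/phpword reports the installed version.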
Patching TemplateProcessor.php for image handling
If you'd like to insert images into your Word file, you might want to patch the TemplateProcessor.php file of PhpOffice/PhpWord as shown in the diff below.
The SRF_Word format will automatically detect that the method searchImageId is available and will use it.
neso:PhpWord wf$ rcsdiff TemplateProcessor.php
===================================================================
RCS file: RCS/TemplateProcessor.php,v
retrieving revision 1.1
diff -r1.1 TemplateProcessor.php
61a62,68
>
> /**
> * Content of document rels (in XML format) of the temporary document.
> *
> * @var string
> */
> private $temporaryDocumentRels;
101a109
> $this->temporaryDocumentRels = $this->zipClass->getFromName('word/_rels/document.xml.rels');
508a517,583
> //
> // Image handling
> // see http://stackoverflow.com/questions/24018003/how-to-add-set-images-on-phpoffice-phpword-template
> //
>
> /**
> * Set a new image
> *
> * @param string $search
> * @param string $replace
> */
>
> public function setImageValue($search, $replace){
> // Sanity check
> if (!file_exists($replace))
> {
> return;
> }
>
> // Delete current image
> $this->zipClass->deleteName('word/media/' . $search);
>
> // Add a new one
> $this->zipClass->addFile($replace, 'word/media/' . $search);
> }
>
> /**
> * Search for the labeled image's rId
> *
> * @param string $search
> */
>
> public function searchImageId($search){
> if (substr($search, 0, 2) !== '${' && substr($search, -1) !== '}') {
> $search = '${' . $search . '}';
> }
> $tagPos = strpos($this->tempDocumentMainPart, $search);
> $rIdStart = strpos($this->tempDocumentMainPart, 'r:embed="',$tagPos)+9;
> $rId=strstr(substr($this->tempDocumentMainPart, $rIdStart),'"', true);
> return $rId;
> }
>
> /**
> * Get img filename with its rId
> *
> * @param string $rId
> */
>
> public function getImgFileName($rId){
> $tagPos = strpos($this->temporaryDocumentRels, $rId);
> $fileNameStart = strpos($this->temporaryDocumentRels, 'Target="media/',$tagPos)+14;
> $fileName=strstr(substr($this->temporaryDocumentRels, $fileNameStart),'"', true);
> return $fileName;
> }
>
> /**
> * set the image with the given searchAlt alternate text
> * @param searchAlt - the alternate text to search for
> * @param replace - the image filename to replace the image with that is found
> */
> public function setImageValueAlt($searchAlt, $replace){
> $_rid=$this->searchImageId($searchAlt);
> $_imagefile=$this->getImgFileName($_rid);
> $this->setImageValue($_imagefile,$replace);
> }
>
>
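To verify that the patch is in place, the patched file can be searched for the methods added by the diff above. This bash sketch is an illustration only: the function name patch_applied is hypothetical, and the path in the usage note assumes Composer's default layout.

```shell
#!/bin/bash
# patch_applied (hypothetical helper): succeed if the given PHP file
# declares all four methods added by the diff above.
patch_applied() {
  local file="$1" method
  for method in setImageValue searchImageId getImgFileName setImageValueAlt
  do
    # -w keeps setImageValue from matching inside setImageValueAlt
    grep -qw "function $method" "$file" || return 1
  done
}
```

A possible call is patch_applied vendor/phpoffice/phpword/src/PhpWord/TemplateProcessor.php && echo "patched".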