MERGE 2013 THE PERFORCE CONFERENCE SAN FRANCISCO • APRIL 24−26
White Paper
This paper describes the automatic documentation
generation system that DVT Technical Publications
uses to generate and bundle the InfoHub
documentation libraries for our product distribution
software. The backbone of this system is our
Perforce installation, which provides the document
control and management portion of our system.
A Perforce-Based Automatic
Documentation Generation System
Chris Shaw, Mentor Graphics Corporation
Introduction
Mentor Graphics® Technical Publications provides InfoHub™ documentation environments to
its customers. Each InfoHub is a product-specific documentation library that gives the
customer the choice of consulting both HTML and PDF versions of every manual. InfoHubs are
highly interconnected; they have many inter- and intra-manual links and a powerful search
capability. Authors create documents with the FrameMaker® document editor and use a cloud-
based, multiprocessing utility to generate the HTML, PDF and InfoHub targets from the
FrameMaker source documents. In a different vein, the engineering groups in Mentor
Graphics’ Design Verification Technology (DVT) division use the Perforce Software Version
Management system for their software code.
Both processes hint at a synergistic opportunity where we combine them in a single
documentation generation and control system. This white paper describes the automatic
documentation generation system that DVT Technical Publications uses to generate and
bundle the InfoHub documentation libraries for our product distribution software. The backbone
of this system is our Perforce installation, which provides the document control and
management portion of our system.
Mentor Graphics InfoHubs
Questa® Formal Technology is a Mentor Graphics family of design verification products that
analyze various assertions about the IC design units being verified. The documentation for this
product suite is packaged as an InfoHub, a JavaScript-based browser page of the documents
in the products’ library (see Figure 1) [1].
Figure 1: Questa Formal Technology InfoHub
Each manual is available in both HTML and PDF, plus the InfoHub has a sophisticated search
capability. The documents are highly interconnected with hyperlinks, and the software GUIs
have links from dialog Help buttons to command pages in references.
DOCGEN Facility
Writers author Technical Publications’ documentation in Adobe® FrameMaker (version 8). The
source files for these documents are .fm files and imported graphics (typically JPEGs).
Mentor Graphics’ Technical Publications Support team supplies a DOCGEN facility that
generates the HTML and PDF targets from the source FrameMaker files [1]. Jobs consisting of
multiple manuals can be submitted. The facility farms out the manual translations to a grid of
servers to handle in parallel.
The DOCGEN facility is accessed either through a Linux utility (docgen) or via a web site.
Publications groups who use the web site typically generate the HTML/PDFs at the end of a
release cycle. This process might take about a week.
The DVT publications group uses a different approach based on the Linux docgen utility. With
our documentation generation system, docgen runs automatically as documents are edited
and submitted to the depot. This mechanism results in a “correct-by-construction”
InfoHub image. It has the advantage of being continually ready to promote into the distribution
software package. Last-minute changes can make it into the release and the final target is
ready to go within minutes.
The final step in the generic document generation process is to update the InfoHub. Mentor
Graphics’ Technical Publications Support team provides a Linux utility (dmerge) to do just that.
Documents in Perforce
DVT source documents are stored in a techpubs depot in the division’s dvt depot (see Figure
2). Each techpubs subdirectory corresponds to a documentation library (except for bin and
archive). Some document libraries are packaged as InfoHubs. Others are not; instead, they
are inserted into multiple InfoHubs.
Figure 2: Techpubs depot in Perforce
The source files for the InfoHub shown in Figure 1 are located in the //dvt/techpubs/zin10.1
depot (see Figure 3) [2]. Each subdirectory corresponds to a single manual. For example, the
command_ref subdirectory contains the source files for the Questa Formal Technology Command
Reference.
Figure 3: zin10.1 Contains source for InfoHub
The Technical Publications organization has strict rules on naming conventions and manual
directory organization. Figure 4 shows an example.
Figure 4: command_ref Structure
All .fm and .book files are stored at the top level of the manual directory; imported graphics are
stored in the graphics subdirectory. The man.book file is the “book” file for the manual. README
is a control file used by the DOCGEN facility.
The documentation administrator maintains the depot, adds manuals, and sets up the InfoHub
support files. Each author can check out a manual, edit and write, then check in the manual
when done. That is the entire flow from the Perforce perspective. So, a writer with minimal
Perforce knowledge can be productive right away.
pubs4d Utility
The pubs4d utility is a Perl program that provides the glue for the automated system. It is a
daemon program that runs continually: it occasionally wakes up and looks for newly submitted
manuals, calls the docgen and dmerge utilities, and performs some cleanup and sanity
checks. The pubs4d utility is a wrapper; DOCGEN does the heavy lifting.
The pubs4d utility displays two xterm windows showing the log and the detailed log of the
session. The detailed log shows the output from the docgen and dmerge utilities. The log shows
top-level information on the progress of the translation. Authors also can display the log
windows with a separate utility.
The following LOG1 transcript shows a translation of two manuals. The utility sends the job to
docgen, waits, and then 5 minutes later, the HTML/PDF targets are ready.
Thursday, Oct 18, 2012 5:39 PM: Processing docgen job with 2
books.
0.0 Syncing FM files in /zin/pubs/docs/build directory.
zin10.1 quickstart_autocheck_user (10.1c_1)
zin10.1 quickstart_cdc_user (10.1c_1)
0.1 Sending jobs to docgen.
5.2 -->zin10.1 quickstart_cdc_user....OK.
Fixing HTML for /zin/pubs/docs/dev/zin10.1/htmldocs.
Copying conversion reports to dev/zin10.1/LOG.
Checking for Warnings.
Found 2 warnings.
Updating master build report.
5.3 -->zin10.1 quickstart_autocheck_user....OK.
Fixing HTML for /zin/pubs/docs/dev/zin10.1/htmldocs.
Copying conversion reports to dev/zin10.1/LOG.
Checking for Warnings.
Found 1 warning.
Updating master build report.
The double-xterm method of showing results is quite useful. It is done by calling the following Perl
subroutine twice (once for each log):
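#---- Sub: display_xterm_log <geometry>, <title>, <logfile>
# Forks a child that opens an xterm and follows <logfile>, polling every 3 seconds.
sub display_xterm_log {
    my ($geometry, $title, $logfile) = @_;
    my $pid = fork;
    return unless defined $pid and $pid == 0;   # parent, or fork failed
    # Child: run a small Perl script inside the xterm. Variables escaped as \$
    # belong to that inner script; $logfile is interpolated here.
    system 'xterm', '-sb', '-sl', '4000', '-geometry', $geometry,
        '-title', $title, '-e', '/usr/bin/perl', '-e',
        qq^
            \$| = 1;
            \$ptr = 0;
            while (1) {
                sleep 3;
                next unless open LOG, "$logfile";
                seek LOG, \$ptr, 0;      # resume where the last pass stopped
                while (<LOG>) { print }
                \$ptr = tell LOG;
                close LOG;
            }
        ^;
    exit;
}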
The pubs4d utility uses p4 filelog to find the manuals submitted since the last docgen job. A build
directory and doc build client are used to handle the doc builds. Here is the Perl code:
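# Sync the manual into the build client; tee the p4 output to the detailed log.
$_ = `p4 -c cshaw-build sync -p $depot/$dir/... 2>&1 | tee -a $log2`;
# $? reflects the last command in the pipe (tee), so the output is scanned as well.
if ($? != 0 or /can't sync/s) {
    print LOG1 "(Error: sync failed. Skipping....)\n";
    next MANUAL;
}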
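The sync step assumes the list of changed manuals is already known. As a rough illustration of how p4 filelog output could be reduced to that list, the sketch below parses filelog text and returns the manual directories that have a revision newer than a given change number. The helper name changed_manuals, the sample file names, and the change numbers are hypothetical; the paper does not show how pubs4d actually performs this step.

```perl
use strict;
use warnings;

# Hypothetical helper: return the library/manual directories under $root that
# have a revision newer than change $last_change in the given filelog text.
sub changed_manuals {
    my ($filelog_text, $root, $last_change) = @_;
    my (%changed, $manual);
    for my $line (split /\n/, $filelog_text) {
        if ($line =~ m{^\Q$root\E/([^/]+/[^/]+)/}) {
            $manual = $1;                       # depot file line
        }
        elsif (defined $manual and $line =~ /^\.\.\. #\d+ change (\d+) /) {
            $changed{$manual} = 1 if $1 > $last_change;   # revision line
        }
    }
    return sort keys %changed;
}

# Sample text in the usual "p4 filelog" layout (names and numbers invented).
my $sample = <<'END';
//dvt/techpubs/zin10.1/command_ref/man.book
... #4 change 5120 edit on 2012/10/18 by cshaw@cshaw-build (binary) 'Update'
//dvt/techpubs/zin10.1/quickstart_cdc_user/intro.fm
... #2 change 5001 edit on 2012/10/01 by cshaw@cshaw-build (binary) 'Typo'
END

my @manuals = changed_manuals($sample, '//dvt/techpubs', 5100);
print "$_\n" for @manuals;
```

Keying the result on the library and manual directory, rather than on individual files, matches the unit that docgen translates: one manual per job entry.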
After all changed manuals are synced, the utility calls docgen and waits:
# Log the elapsed minutes since the job started, then hand the job to docgen.
printf LOG1 "%4.1f\tSending jobs to docgen.", (time - $time)/60;
$_ = `docgen -source $docgen_driver_file 2>&1 | tee -a $log2`;
sleep 240; #--- give docgen a chance to get started
print LOG1 "\n";
As each manual completes, pubs4d copies the HTML, PDF, logs and reports to a dev directory
and updates a Build Reports web page. After all docs in the job are processed, the utility
checks to see if any other manuals have changed since the last job and processes them. Once
all pending manuals are built, the utility runs dmerge to update changed InfoHubs.
$_ = `dmerge $devdir/$infohub_type$version/htmldocs -add_global_elements 2>&1 | tee -a $log2`;
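Taken together, the steps in this section amount to a polling loop: find changed manuals, build them, repeat until nothing is pending, then merge. The sketch below shows that control flow only. The function poll_loop and its find_changed, build, and merge callbacks are hypothetical stand-ins for the p4, docgen, and dmerge calls, and max_cycles exists only so the example terminates; the real daemon runs continually.

```perl
use strict;
use warnings;

# Hypothetical polling loop; find_changed/build/merge are caller-supplied
# code references standing in for the p4, docgen, and dmerge steps.
sub poll_loop {
    my (%arg) = @_;
    my $cycles = 0;
    while (!defined $arg{max_cycles} or $cycles < $arg{max_cycles}) {
        $cycles++;
        # Keep building until no manual has changed since the last job.
        my $built_any = 0;
        while (my @manuals = $arg{find_changed}->()) {
            $arg{build}->(@manuals);
            $built_any = 1;
        }
        # Update the InfoHubs only when something was rebuilt.
        $arg{merge}->() if $built_any;
        sleep $arg{interval} if $arg{interval};
    }
    return $cycles;
}

# Usage with stub callbacks: two pending batches, then nothing.
my @queue = (['command_ref'], ['quickstart_cdc_user']);
my @events;
my $cycles = poll_loop(
    max_cycles   => 2,
    interval     => 0,
    find_changed => sub { my $next = shift @queue; return $next ? @$next : () },
    build        => sub { push @events, "build @_" },
    merge        => sub { push @events, 'merge' },
);
print "$_\n" for @events;
```

Running the merge once per wake-up, after the inner loop drains, avoids rebuilding an InfoHub for every manual in a burst of submissions.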
The utility also runs a checklinks subroutine that checks all hypertext links in the InfoHub’s
documents and verifies that their targets exist and links are well formed. This subroutine
creates a report for each InfoHub. Here is an excerpt (for product releases, you want to resolve
all hypertext link issues):
zin10.1 InfoHub Links Report
Topics: 9021
XRef: 12590; Missing Targets: 11
GoTo: 2878; Vacuous Links: 2;
Missing Links: 237; Bad Targets: 49
Ambiguous GoTo target: autocheck_user ==> 'CASE_DEFAULT'
from command_ref/tcl11.html
Missing GoTo link: ' -togglecountlimit ' in
questa_sim_ref/a_commands_vsim1.html
Missing GoTo link: ' -t' in
questa_sim_ref/a_commands_p_vcn067.html
. . .
Missing GoTo target: autocheck_user ==> 'ONE_COLD' from
zeroin_rh/highlights3.html
Missing GoTo target: autocheck_user ==> 'ONE_HOT' from
zeroin_rh/highlights3.html
. . .
Missing Xref target: 'a_functions157.html#CRefID59132"' from
questa_sim_fli/a_catalog1.html
Missing Xref target: 'a_ver_plan16.html#CRefID32138' from
questa_sim_vm/a_test_track3.html
. . .
Vacuous GoTo link: before ' ' in
questa_sim_user/a_sdf_timing21.html
Vacuous GoTo link: before '-learn <r'
in questa_sim_ref/a_commands_vsim1.html
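The checklinks source is not shown in this paper. To illustrate the basic idea of verifying that link targets exist, here is a minimal sketch that works on an in-memory set of pages. The function check_links and its message wording are hypothetical, it assumes each href is already written relative to the InfoHub root (the same form as the hash keys), and it does not reproduce the XRef, GoTo, and vacuous-link categories of the real report.

```perl
use strict;
use warnings;

# Hypothetical link checker: %$pages maps a relative HTML path to its content.
# Returns one message per link whose target file or anchor is missing.
sub check_links {
    my ($pages) = @_;
    # Collect the anchors (id= or name= attributes) defined on each page.
    my %anchors;
    for my $page (keys %$pages) {
        $anchors{$page}{$1} = 1
            while $pages->{$page} =~ /\b(?:id|name)="([^"]+)"/g;
    }
    my @problems;
    for my $page (sort keys %$pages) {
        while ($pages->{$page} =~ /<a\s[^>]*\bhref="([^"#]*)(?:#([^"]*))?"/gi) {
            my ($file, $frag) = ($1, $2);
            $file = $page if $file eq '';          # same-page link
            if (!exists $pages->{$file}) {
                push @problems, "Missing target file: '$file' from $page";
            }
            elsif (defined $frag and !$anchors{$file}{$frag}) {
                push @problems, "Missing anchor: '$file#$frag' from $page";
            }
        }
    }
    return @problems;
}

# Usage: one link to an undefined anchor, one to a missing file, one that resolves.
my %pages = (
    'command_ref/tcl11.html' =>
        '<a href="command_ref/tcl12.html#ONE_HOT">hot</a> <a href="command_ref/tcl12.html#ONE_COLD">cold</a>',
    'command_ref/tcl12.html' =>
        '<h2 id="ONE_COLD">One cold</h2> <a href="gone.html">gone</a>',
);
my @problems = check_links(\%pages);
print "$_\n" for @problems;
```

Collecting all anchors in a first pass keeps the check to two linear scans of the library, which matters at the scale of the report above (thousands of topics and links).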
Tech Pubs Build Reports
A Tech Pubs Build Reports web page (see Figure 5) is the hub for information about the results
of document generation. As pubs4d processes a manual, it updates this page with information
about the manual’s translation. The utility also copies ancillary reports generated by docgen to
locations linked to on this web page.
Figure 5: Tech Pubs build reports web page
A red (X) next to the depot name indicates the translation was unsuccessful. It links to a log
returned by docgen.
{DG} 13:21:29 >> DocGen job started on Mon Sep 10 13:21:...
{DG} 13:21:29 >> DocGen Last Updated: Tue Aug 21 12:54:1...
{DG} 13:21:29 >> DocGen Job Host: sofa
{DG} 13:21:30 >> FrameMaker File: man.book
{DG} 13:21:30 >> check_book_md5sum output:
{DG} 13:21:32 >> WARNING: Handle quick_ref found in this
file does not match handle qstatic_rn found
in framemaker project.
{DG} 13:21:33 >> Check Links status: w
{DG} 13:21:36 >> Copying output files to destination:
sje:/zin/pubs/docs/dev/qstatic10.2
ERROR: PDF file not generated: /wv/techpubs/docgen/jobs...
{DG} 13:21:37 >> Copying output files to exact destination:
sje:/zin/pubs/docs/dev/qstatic10.2/LOG
ERROR: PDF file not generated: /wv/techpubs/docgen/jobs...
{DG} 13:21:37 >> Job will be skipped because FrameMaker
source has not changed since last DocGen
conversion.
{DG} 13:21:37 >> DocGen job finished with exit code: 100
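A log in this format lends itself to a quick mechanical summary, for example when deciding which status marker a build report entry should show. The sketch below counts ERROR and WARNING lines and extracts the reported exit code. The function summarize_docgen_log is hypothetical, and the sketch deliberately assigns no meaning to particular exit codes, since the paper does not define them.

```perl
use strict;
use warnings;

# Hypothetical summarizer for a docgen job log in the format shown above.
sub summarize_docgen_log {
    my ($log_text) = @_;
    my %summary = (errors => 0, warnings => 0, exit_code => undef);
    for my $line (split /\n/, $log_text) {
        $summary{errors}++   if $line =~ /^\s*ERROR:/;
        $summary{warnings}++ if $line =~ /\bWARNING:/;
        $summary{exit_code} = $1
            if $line =~ /DocGen job finished with exit code:\s*(\d+)/;
    }
    return \%summary;
}

# Usage with a few lines taken from the log excerpt above.
my $log = <<'END';
{DG} 13:21:32 >> WARNING: Handle quick_ref found in this
ERROR: PDF file not generated: /wv/techpubs/docgen/jobs...
ERROR: PDF file not generated: /wv/techpubs/docgen/jobs...
{DG} 13:21:37 >> DocGen job finished with exit code: 100
END

my $s = summarize_docgen_log($log);
printf "errors=%d warnings=%d exit=%s\n", @$s{qw(errors warnings exit_code)};
```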
A red (X) before the conversion timestamp links to a page of HTML translation warnings.
These are typically broken references that are also caught by checklinks. A green check (✓)
indicates HTML translation had no warnings. The document handle (for example command_ref)
links to the HTML Conversion Report (see Figure 6) generated by docgen.
Figure 6: HTML conversion report for a manual
Native-drawn FrameMaker graphics must follow strict rules to ensure HTML counterparts are
rendered properly. These rules are defined by the Technical Publications Support team.
However, figures still are prone to mistakes and corrupt rendering. A useful docgen report
shows only the generated graphics images in the corresponding manual. The graphics link in
the Build Report entry for the manual displays this report (see Figure 7) [3].
Figure 7: Graphics report for a manual
The Tech Pubs Build Reports page has a side bar that links to various InfoHub-related
information and pubs4d session logs. Each entry in the InfoHubs section brings up its
associated InfoHub. Authors can check how their edits appear in the documentation set in
(virtual) real time. The Checklinks entries bring up the checklinks reports for the associated
InfoHubs. The Build Logs are the LOG1 and LOG2 outputs of the pubs4d utility.
Author Work Flow
The work flow for authors is remarkably simple. We use the Perforce system (either command-line
or P4V interface, or typically both). The author checks out a manual, performs edits, and
checks the manual back in. The pubs4d daemon (with the help of docgen) performs the
translations and InfoHub update. Meanwhile, the author monitors translation progress from an
xterm log.
When translation and InfoHub generation are complete, the author checks the Tech Pubs Build
Reports web page for errors and warnings. The author also can check the graphics report,
bring up the associated InfoHub to check document edits and look at the checklinks report to
find and debug bad hypertext links.
Administrator Work Flow
The documentation system administrator performs the manual tasks. Surprisingly, these are
minimal and uncomplicated. Aside from maintaining the build/release utilities (pubs4d and
pubs4), the administrator handles adding and removing manuals and InfoHubs.
Creating a custom InfoHub is a Tech Pubs procedure and just entails copying an existing hub
and modifying several files. Work to insert the new InfoHub into the Documentation Generation
System consists of putting the InfoHub into the dev area, updating the Build Reports page
manually, and updating pubs4d to recognize the depots that contain manuals to process.
Adding a manual to the system consists of adding an entry for it in the Build Reports page and
adding the initial document to the Perforce techpubs depot.
Adjustments can be made by authors as well as the administrator: .fm files and the graphics
files are added to, or removed from, the depot. Our experience is that administrator
intervention is only needed when major changes happen, such as rolling over the software for
a major release (which happens once a year).
History
The DVT Tech Pubs Documentation Generation System has been in operation for about 6
years. In that time, it has evolved considerably. It originally started out running HTML and PDF
translators separately and had loads of sanity-checking code. It only ran on old Solaris boxes.
In addition to being painfully slow, it ran translations sequentially. An extensive document
rebuild—for example, 8 large books—might take more than 10 hours to complete.
This was OK. We could wait overnight and hope the build performed without error. But, the
system still replaced the tedious manual task of running translators directly and comparing
logs.
Since then, the Technical Publications Support team created docgen, which moved FM-to-HTML/PDF
translation to a Solaris grid (of old but big machines). The interface is now Linux
based and the translations are performed in parallel. The docgen utility even supports entry and
target points at our various sites around the world.
Now, an extensive document rebuild of more than 8 manuals might take an hour—if the
manuals are huge and have many graphics and the grid has a lot of traffic. But typical usage is
5 to 15 minutes for the average multiple-manual translation.
Plans for the Future
The system is scalable. In addition to new writers, we are opening the system up to
engineering authors.
For example, our Verification IP group creates testbench IP that exercises tests of IC buses
and interfaces. Protocols for these interfaces are meticulous and arcane. Documentation for
these products is detailed and tedious and constitutes thousands of pages. We are slowly
setting up engineering authors and technical experts to author directly in our FrameMaker
source files. They not only author topic sections, but they also add embedded comments for
their writers to resolve. Since they invariably have prior knowledge of the Perforce system, the
ramp up for authoring is quick. We plan to continue to roll out this methodology to other
engineers.
Other Issues
Some issues are beyond the scope of this paper, including:
• Locks
Our organizational setup precludes collisions—writers tend to work on separate
manuals and chapters. When collisions are possible—for example, when a writer works
with an engineering author—the team members agree to lock files when they are
checked out. Although not currently necessary, we might consider system locks with
checkouts. With FrameMaker, DIFFs are more difficult than with text files. However,
FrameMaker does have a document compare facility, which makes resolving document
collisions a simple, albeit manual, process.
• Triggers
Using a trigger to wake up the pubs4d daemon is probably the way to go. But we opted
for a daemon that sleeps and wakes periodically, for various innocuous reasons.
• Promotion
Generating the InfoHub targets is only part of the process of delivering documentation
to the development software location, which also gets built into the distribution software.
Secondary processes shape the deliverable documentation targets. These are baked
into a multipurpose Perl script we call pubs4.
This script “promotes” the dev image to the release image. Along the way, it scans the
HTML and creates support files for the GUIs so Help buttons on dialogs link properly to
the corresponding topics in the documentation. Generated GUI files also include text
extracted from tables in documents that are displayed by hover help. This process helps
keep GUI prompts totally consistent with the documentation. For example, error
messages from a Messages Reference expand the terse information returned by the
software.
• Other facilities
The pubs4 utility is indeed multipurpose. In addition to promotion, the script can be used
to generate a final release version of the documentation for a product family. This image
is ready to import to the award-winning Mentor Graphics SupportNet® customer support
web site.
As mentioned above, the pubs4 utility also displays the dual-xterm logs that authors use
to monitor the progress of document generation.
Conclusion
The DVT Technical Publications group at Mentor Graphics Corporation developed a “wrapper”
for the document generation facility supplied by the Technical Publications Support group.
Eventually—since our engineering teams use Perforce—we incorporated the wrapper into a
Perforce-based documentation source control methodology. This system has been in
operation more than 6 years.
Along the way, the system has evolved. Old Solaris-based utilities were replaced by docgen, a
cloud-based multiprocessing utility. The corresponding speedup was on the order of 10X. Plus,
the sanity-checking code was replaced by much more sophisticated checking internal to the
docgen utility.
Additional features include hypertext link checking and GUI data extraction.
Perforce is the backbone of the system. It offers a simple mechanism for checking documents
out and in. Interfacing with Perforce and the document depots through Perl is easy and
efficient. The visual Perforce application (P4V) provides an elegant interface for writers and
other authors to use as a cockpit for their documentation management tasks.
Off-loading usually-manual processes to an automated under-the-hood methodology frees our
authors from the tedious process of preparing documentation targets for the distribution
software. Instead of performing this process at the end of a long release cycle, we create a
correct-by-construction documentation image ready to go at the “drop of a hat.”
This image is also integrated into nightly builds so developers can see relevant portions of the
documentation in “real time” rather than waiting for some end game to finish. This
documentation image is metaphorically a “living document,” which evolves with the software
and reflects dynamic information such as comments from development and field engineers.
Such a system has freed our authors to do what we do best—write.
References
[1] Documentation Processes at Mentor Graphics, internal document, Mentor Graphics Corp.
(2012).
[2] Perforce 2013.1 P4 User’s Guide, Perforce Software (2013).
[3] Chris Shaw, Questa CDC User Guide V10.2, Mentor Graphics Corp. (2013).
Mentor Graphics® and Questa® are registered trademarks of Mentor Graphics Corporation.
InfoHub™ is a trademark of Mentor Graphics Corporation. Adobe® and FrameMaker® are
registered trademarks of Adobe Systems Inc.
Perforce™ is a trademark of Perforce Software.