Automating the creation of research artifacts
A simple way to automate generating source code documentation, creating HTML and PDF versions of user documentation, compiling a technical (research) document to PDF, generating the bibliography, and provisioning virtual machines.
In my work as a programming language researcher, I need to create artifacts that are easy to understand and well documented. To make this easier, I found a simple way to automate generating source code documentation, creating HTML and PDF versions of user documentation, compiling a technical (research) document to PDF, generating the bibliography, and provisioning virtual machines with the software artifact installed, so that my research is easy to reproduce.
The tools I use are:
- Make (with makefiles) for overall orchestration of all components
- Haddock for generating source code documentation
- Pandoc for generating PDF and HTML files from a Markdown file
- Vagrant for provisioning virtual machines
- Stack for downloading Haskell dependencies, compiling, running tests, etc.
- pdfLaTeX for compiling a LaTeX file to PDF format
- BibTeX for generating a bibliography
- Zip to pack everything and get it ready for distribution
I use the following folder and file structure:
├── Makefile
├── Vagrantfile
├── code
│   └── typechecker-oopl (Project)
│       ├── Makefile
│       └── ...
│
├── documentation
│   ├── Makefile
│   ├── README.md
│   ├── assets
│   │   ├── pandoc.css (Customised CSS for Pandoc)
│   │   └── submitted-version.pdf (PDF of your research)
│   └── meta.yaml
│
├── research
│   ├── Makefile
│   ├── ACM-Reference-Format.bst
│   ├── acmart.cls
│   ├── biblio.bib
│   └── typecheckingMonad.tex
The top-level Makefile glues together the output from all of the tools listed above. The code folder contains the source code of the tool/language I created. The documentation folder contains a Makefile with instructions for generating PDF and HTML versions of the user instructions, which live in the README.md file; I generate both versions with Pandoc. The assets are simply the CSS style to use and a PDF of my research article, which is hyperlinked from the generated user documentation so that it is easy to follow. meta.yaml contains metadata that Pandoc uses when generating the user documentation, e.g., the author names. The research folder contains my research article in LaTeX format, but it could hold any other technical document.
As you can see in the structure, I have a Makefile for each folder to decouple each Makefile's responsibility and keep a (somewhat) maintainable design. Here is an overview of the top-level Makefile, which orchestrates generating the user documentation, the research paper, the bibliography, and the source code documentation, as well as provisioning a virtual machine.
all: doc gen

doc:
	$(MAKE) -C $(DOC_SRC) $@
	$(MAKE) -C $(CODE_PATH) $@
	$(MAKE) -C $(RESEARCH)

gen:
	# Create the artefact folder, empty at the moment
	mkdir -p $(ARTEFACT_FOLDER)
	# Copy user documentation to artefact folder
	cp $(DOC_SRC)/$(README).pdf $(ARTEFACT_FOLDER)
	cp $(DOC_SRC)/$(README).html $(ARTEFACT_FOLDER)
	cp -r $(DOC_SRC)/$(ASSETS) $(ARTEFACT_FOLDER)
	# Copy research article to artefact folder
	cp $(RESEARCH)/$(RESEARCH_PAPER).pdf $(ARTEFACT_FOLDER)/$(ASSETS)/submitted-version.pdf
	# Copy code and autogenerated doc to artefact folder
	cp -r $(CODE_PATH) $(ARTEFACT_FOLDER)
	# Each recipe line runs in its own shell, so cd and the command share one line
	cd $(ARTEFACT_FOLDER)/$(CODE_SRC) && $(STACK)
	rm -rf $(ARTEFACT_FOLDER)/$(DOC_SRC)
	mv $(ARTEFACT_FOLDER)/$(CODE_SRC)/$(HADDOCK) $(ARTEFACT_FOLDER)/$(DOC_SRC)
	# zip it! (-r includes the folder's contents, not just the folder entry)
	zip -r $(ZIP_FILE) $(ARTEFACT_FOLDER)

update:
	vagrant up
	vagrant provision

clean:
	rm -rf $(ARTEFACT_FOLDER)

.PHONY: all clean doc gen update
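The overview above uses several variables without showing their definitions. A minimal sketch of what they might look like at the top of the top-level Makefile follows. Most values are inferred from the folder structure and the other files in this article; `ZIP_FILE` and `STACK` are pure assumptions, since the article does not show them.

```make
# Hypothetical definitions; adjust to your own layout.
DOC_SRC         = documentation
CODE_SRC        = typechecker-oopl
CODE_PATH       = code/$(CODE_SRC)
RESEARCH        = research
RESEARCH_PAPER  = typecheckingMonad
README          = README
ASSETS          = assets
# Folder name taken from the path used in the Vagrantfile
ARTEFACT_FOLDER = artefact-submission
# Output folder passed to Haddock via --odir
HADDOCK         = docs
# Assumption: name of the archive to distribute
ZIP_FILE        = artefact.zip
# Assumption: the article does not show which Stack command runs here
STACK           = stack build
```

Keeping these names in one place means a renamed folder only has to be changed once.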
First, the doc target generates the user documentation using Pandoc; then it uses Haddock to generate documentation from the Haskell library source code; finally, it creates a PDF from the LaTeX file. The generated user documentation is in HTML and CSS. It contains links to the generated source code documentation, also in HTML and CSS, and to the technical (research) paper. The generated source code documentation links directly to the source code, in case the reader would like to understand the implementation.
The user documentation is generated with the following Makefile:
DOCS=README.md
META=meta.yaml
# -N numbers section headings; add $(NUMBER_SECTION_HEADINGS) to the pandoc calls to enable it
NUMBER_SECTION_HEADINGS=-N

.PHONY: all doc clean

all: doc

doc: $(DOCS)
	pandoc -s $(META) $(DOCS) --listings --pdf-engine=xelatex -c assets/pandoc.css -o $(DOCS:md=pdf)
	pandoc -s $(META) $(DOCS) --self-contained -c assets/pandoc.css -o $(DOCS:md=html)

clean:
	rm -f $(DOCS:md=pdf) $(DOCS:md=html)
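Because doc is a phony target, both Pandoc commands run on every invocation. One possible refinement, sketched here and not part of the original setup, is to make the PDF and HTML files real targets so that Make rebuilds them only when README.md, meta.yaml, or the stylesheet changes. The `CSS`, `PDF`, and `HTML` variable names are assumptions introduced for this sketch.

```make
DOCS = README.md
META = meta.yaml
CSS  = assets/pandoc.css

# Derived output names: README.pdf and README.html
PDF  = $(DOCS:.md=.pdf)
HTML = $(DOCS:.md=.html)

.PHONY: all doc clean

all: doc

doc: $(PDF) $(HTML)

# $< is the first prerequisite (the Markdown file), $@ is the target
%.pdf: %.md $(META) $(CSS)
	pandoc -s $(META) $< --listings --pdf-engine=xelatex -c $(CSS) -o $@

%.html: %.md $(META) $(CSS)
	pandoc -s $(META) $< --self-contained -c $(CSS) -o $@

clean:
	rm -f $(PDF) $(HTML)
```

The Pandoc flags are the same as in the Makefile above; only the dependency tracking differs.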
To generate documentation from Haskell code, I use another Makefile, which uses Stack to compile the library and download dependencies, and Haddock (invoked through the OPTS variable, short for options) to create documentation in HTML:
OPTS=exec -- haddock --html --hyperlinked-source --odir=docs

doc:
	stack $(OPTS) src/Initial/AST.hs src/Initial/Typechecker.hs \
		src/Reader/AST.hs src/Reader/Typechecker.hs \
		src/Backtrace/AST.hs src/Backtrace/Typechecker.hs \
		src/Warning/AST.hs src/Warning/Typechecker.hs \
		src/MultiError/AST.hs src/MultiError/Typechecker.hs \
		src/PhantomFunctors/AST.hs src/PhantomFunctors/Typechecker.hs \
		src/PhantomPhases/AST.hs src/PhantomPhases/Typechecker.hs \
		src/Applicative/AST.hs src/Applicative/Typechecker.hs \
		src/Final/AST.hs src/Final/Typechecker.hs

.PHONY: doc
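The explicit file list has to be updated whenever a module is added. A possible alternative, sketched under the assumption that every directory under src/ that should be documented contains an AST.hs and a Typechecker.hs, is to let Make collect the files. The `SOURCES` variable is a name introduced for this sketch; note that the files are passed in sorted order rather than the order listed above, and that any additional matching directory would be picked up too.

```make
OPTS = exec -- haddock --html --hyperlinked-source --odir=docs

# Assumption: all modules to document follow the src/<Name>/{AST,Typechecker}.hs layout.
# wildcard expands the patterns to existing files; sort orders them and drops duplicates.
SOURCES = $(sort $(wildcard src/*/AST.hs src/*/Typechecker.hs))

.PHONY: doc

doc:
	stack $(OPTS) $(SOURCES)
```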
I compile the research paper from LaTeX to PDF with this simple Makefile:
.PHONY: research

research:
	pdflatex typecheckingMonad.tex
	bibtex typecheckingMonad
	pdflatex typecheckingMonad.tex
	pdflatex typecheckingMonad.tex
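The order of the commands matters: the first pdflatex run records the citations in the .aux file, bibtex turns them into a .bbl bibliography, and the two remaining pdflatex runs pull the bibliography in and resolve the cross-references. A slightly extended sketch of the same Makefile, with the `PAPER` variable and the clean target added as assumptions, could look like this:

```make
# Hypothetical variable holding the base name of the paper
PAPER = typecheckingMonad

.PHONY: research clean

research: $(PAPER).pdf

# Rebuild only when the LaTeX source or the bibliography changes
$(PAPER).pdf: $(PAPER).tex biblio.bib
	pdflatex $(PAPER).tex   # writes citation keys to $(PAPER).aux
	bibtex $(PAPER)         # builds $(PAPER).bbl from biblio.bib
	pdflatex $(PAPER).tex   # includes the bibliography
	pdflatex $(PAPER).tex   # resolves remaining references

clean:
	rm -f $(PAPER).aux $(PAPER).bbl $(PAPER).blg $(PAPER).log $(PAPER).out $(PAPER).pdf
```

The research target stays first in the file, so the top-level call to make without an explicit target still builds the paper.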
The virtual machine (VM) relies on Vagrant and the Vagrantfile, where I can write all the commands to set up the VM. The one thing that I do not know how to automate is moving all of the documentation, once it is generated, into the VM, so currently I log in to the VM manually and place the documentation in the Desktop folder. If you know how to transfer the files from the host machine to the VM, please share your solution in the comments.
# All Vagrant configuration is done below. The "2" in Vagrant.configure
# configures the configuration version (we support older styles for
# backwards compatibility). Please don't change it unless you know what
# you're doing.
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/trusty64"
  config.ssh.username = "vagrant"
  config.ssh.password = "vagrant"

  config.vm.provider "virtualbox" do |vb|
    # Display the VirtualBox GUI when booting the machine
    vb.gui = true
    # Customize the amount of memory on the VM:
    vb.memory = "2048"
    vb.customize ["modifyvm", :id, "--vram", "64"]
  end

  config.vm.provision "shell", inline: <<-SHELL
    ## Installing dependencies, comment after this has been done once.
    # sudo apt-get update -y
    # sudo apt-get install ubuntu-desktop -y
    # sudo apt-get install -y build-essential linux-headers-server
    # echo 'PATH="/home/vagrant/.local/bin:$PATH"' >> /home/vagrant/.profile
    ## Comment and remove the folder sharing before submission
    mkdir -p /home/vagrant/Desktop/TypeChecker
    cp -r /vagrant/artefact-submission/* /home/vagrant/Desktop/TypeChecker/
    chown -R vagrant:vagrant /home/vagrant/Desktop/TypeChecker/
  SHELL
end
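One possible way to automate the transfer from the host to the VM is the `vagrant upload` command, which copies a file or directory from the host to the guest. The sketch below adds it as an extra target to the top-level Makefile. It assumes Vagrant 2.2 or later; the target name `upload` and the `VM_DEST` variable are hypothetical, and the paths mirror those used in the Vagrantfile above. Vagrant's `file` provisioner is another option if the copy should happen as part of provisioning.

```make
# Folder produced by the gen target (name taken from the Vagrantfile path)
ARTEFACT_FOLDER = artefact-submission
# Hypothetical destination inside the guest
VM_DEST = /home/vagrant/Desktop/TypeChecker

.PHONY: upload

upload:
	# Make sure the VM is running before copying
	vagrant up
	# Copy the generated artefact folder from the host into the guest
	vagrant upload $(ARTEFACT_FOLDER) $(VM_DEST)
```

Running `make gen upload` would then build the artefact and copy it into the VM in one step; check the resulting folder layout in the guest before relying on it.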
With this final step, everything has been wired together. You can see one example of the result in HTML and in PDF. I have created a GitHub repo with all the source code for ease of study and reproducibility.
I have used this setup for two conferences: the European Conference on Object-Oriented Programming (ECOOP) and the International Conference on Software Language Engineering (SLE). We won the Distinguished Artifact Award at both.
via: https://opensource.com/article/20/1/automating-documentation
Author: Kiko Fernandez-Reyes; topic selected by: lujun9972