[#]: collector: (lujun9972)
[#]: translator: ( )
[#]: reviewer: ( )
[#]: publisher: ( )
[#]: url: ( )
[#]: subject: (Automating the creation of research artifacts)
[#]: via: (https://opensource.com/article/20/1/automating-documentation)
[#]: author: (Kiko Fernandez-Reyes https://opensource.com/users/kikofernandez)
Automating the creation of research artifacts
======
A simple way to automate generating source code documentation, creating
HTML and PDF versions of user documentation, compiling a technical
(research) document to PDF, generating the bibliography, and
provisioning virtual machines.
![Files in a folder][1]
In my work as a programming language researcher, I need to create [artifacts][2] that are easy to understand and well-documented. To make my work easier, I found a simple way to automate generating source code documentation, creating HTML and PDF versions of user documentation, compiling a technical (research) document to PDF, generating the bibliography, and provisioning virtual machines with the software artifact installed, so that my research is easy to reproduce.
The tools I use are:
* [Make][3] makefiles for overall orchestration of all components
* [Haddock][4] for generating source code documentation
* [Pandoc][5] for generating PDF and HTML files from a Markdown file
* [Vagrant][6] for provisioning virtual machines
* [Stack][7] for downloading Haskell dependencies, compiling, running tests, etc.
* [pdfLaTeX][8] for compiling a LaTeX file to PDF format
* [BibTeX][9] for generating a bibliography
* [Zip][10] to pack everything and get it ready for distribution
I use the following folder and file structure:
```
├── Makefile
├── Vagrantfile
├── code
│   └── typechecker-oopl (Project)
│       ├── Makefile
│       └── ...
├── documentation
│   ├── Makefile
│   ├── README.md
│   ├── assets
│   │   ├── pandoc.css (Customised CSS for Pandoc)
│   │   └── submitted-version.pdf (PDF of your research)
│   └── meta.yaml
└── research
    ├── Makefile
    ├── ACM-Reference-Format.bst
    ├── acmart.cls
    ├── biblio.bib
    └── typecheckingMonad.tex
```
The top-level Makefile glues together the output from all of the tools listed above. The **code** folder contains the source code of the tool/language I created. The **documentation** folder contains a Makefile with instructions for generating PDF and HTML versions of the user instructions, which live in the README.md file. I generate both versions using Pandoc. The **assets** folder holds the CSS style to use and a PDF of my research article, which is hyperlinked from the generated user documentation so that it is easy to follow. **meta.yaml** contains metadata that Pandoc uses when generating the user documentation, e.g., the author names. The **research** folder contains my research article in LaTeX format, but it could hold any other technical document.
As you can see in the structure, I have a [Makefile][11] for each folder to decouple each Makefile's responsibility and keep a (somewhat) maintainable design. Here is an overview of the top-level Makefile, which orchestrates generating the user documentation, research paper, bibliography, documentation from source code, and provisioning of a virtual machine.
```
all: doc gen
doc:
	$(MAKE) -C $(DOC_SRC) $@
	$(MAKE) -C $(CODE_PATH) $@
	$(MAKE) -C $(RESEARCH)
gen:
	# Creation of folder with artefact, empty at the moment
	mkdir -p $(ARTEFACT_FOLDER)
	# Copying user documentation to artefact folder
	cp $(DOC_SRC)/$(README).pdf $(ARTEFACT_FOLDER)
	cp $(DOC_SRC)/$(README).html $(ARTEFACT_FOLDER)
	cp -r $(DOC_SRC)/$(ASSETS) $(ARTEFACT_FOLDER)
	# Copying research article to artefact folder
	cp $(RESEARCH)/$(RESEARCH_PAPER).pdf $(ARTEFACT_FOLDER)/$(ASSETS)/submitted-version.pdf
	# Copying code and autogenerated doc to artefact folder
	cp -r $(CODE_PATH) $(ARTEFACT_FOLDER)
	# Each recipe line runs in its own shell, so cd and the command share one line
	cd $(ARTEFACT_FOLDER)/$(CODE_SRC) && $(STACK)
	rm -rf $(ARTEFACT_FOLDER)/$(DOC_SRC)
	mv $(ARTEFACT_FOLDER)/$(CODE_SRC)/$(HADDOCK) $(ARTEFACT_FOLDER)/$(DOC_SRC)
	# zip it! (-r recurses into the folder)
	zip -r $(ZIP_FILE) $(ARTEFACT_FOLDER)
update:
	vagrant up
	vagrant provision
clean:
	rm -rf $(ARTEFACT_FOLDER)
.PHONY: all clean doc gen update
```
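The overview above uses several variables without showing their definitions. As a rough sketch, they could be defined along the following lines. Every value here is an assumption inferred from the folder structure and the Vagrantfile in this article rather than taken from the actual project, and `STACK` in particular is only a guess at the build command.

```makefile
# Hypothetical values; adjust them to your own project layout.
DOC_SRC         = documentation
CODE_SRC        = typechecker-oopl
CODE_PATH       = code/$(CODE_SRC)
RESEARCH        = research
RESEARCH_PAPER  = typecheckingMonad
README          = README
ASSETS          = assets
# Folder name matches the path copied by the Vagrantfile provisioner
ARTEFACT_FOLDER = artefact-submission
# Output folder matches --odir=docs in the Haddock Makefile
HADDOCK         = docs
# Assumed build command, run inside the copied code folder
STACK           = stack build
ZIP_FILE        = artefact-submission.zip
```

Keeping comments on their own lines avoids trailing whitespace sneaking into the variable values.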
First, the **doc** target generates the user documentation using Pandoc, then uses Haddock to generate the documentation from the Haskell library source code, and finally creates a PDF from the LaTeX file. As depicted in the image below, the generated user documentation is in HTML and CSS. It contains links to the generated source code documentation, also in HTML and CSS, and to the technical (research) paper. The generated source code documentation links directly to the source code, in case the reader would like to understand the implementation.
![Artifact automation structure][12]
The user documentation is generated with the following Makefile:
```
DOCS=README.md
META=meta.yaml
NUMBER_SECTION_HEADINGS=-N
.PHONY: all doc clean
all: doc
doc: $(DOCS)
	pandoc -s $(META) $(DOCS) --listings --pdf-engine=xelatex -c assets/pandoc.css -o $(DOCS:md=pdf)
	pandoc -s $(META) $(DOCS) --self-contained -c assets/pandoc.css -o $(DOCS:md=html)
clean:
	rm -f $(DOCS:md=pdf) $(DOCS:md=html)
```
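One possible refinement, not part of the original setup, is to make the PDF and HTML files real targets so that Make reruns Pandoc only when the README, the metadata, or the stylesheet changes. The sketch below reuses the Pandoc invocations from above unchanged; the `CSS` variable and the two pattern rules are illustrative additions.

```makefile
DOCS = README.md
META = meta.yaml
# Hypothetical helper variable for the stylesheet path used above
CSS  = assets/pandoc.css

.PHONY: all doc clean
all: doc

# doc now depends on the generated files instead of running Pandoc itself
doc: $(DOCS:md=pdf) $(DOCS:md=html)

# Rebuild the PDF only when the Markdown, metadata, or CSS is newer
%.pdf: %.md $(META) $(CSS)
	pandoc -s $(META) $< --listings --pdf-engine=xelatex -c $(CSS) -o $@

# Same idea for the HTML version
%.html: %.md $(META) $(CSS)
	pandoc -s $(META) $< --self-contained -c $(CSS) -o $@

clean:
	rm -f $(DOCS:md=pdf) $(DOCS:md=html)
```

With this layout, running `make doc` twice in a row should do nothing the second time, which keeps the top-level build quick when only the code or the paper has changed.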
To generate documentation from the Haskell code, I use another Makefile. It uses Stack to compile the library and download dependencies, and Haddock (invoked through OPTS, the options passed to Stack) to create documentation in HTML:
```
OPTS=exec -- haddock --html --hyperlinked-source --odir=docs
doc:
        stack $(OPTS) src/Initial/AST.hs src/Initial/Typechecker.hs \
        src/Reader/AST.hs src/Reader/Typechecker.hs \
        src/Backtrace/AST.hs src/Backtrace/Typechecker.hs \
        src/Warning/AST.hs src/Warning/Typechecker.hs \
        src/MultiError/AST.hs src/MultiError/Typechecker.hs \
        src/PhantomFunctors/AST.hs src/PhantomFunctors/Typechecker.hs \
        src/PhantomPhases/AST.hs src/PhantomPhases/Typechecker.hs \
        src/Applicative/AST.hs src/Applicative/Typechecker.hs \
        src/Final/AST.hs src/Final/Typechecker.hs
.PHONY: doc
```
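If listing every module by hand becomes tedious, GNU Make's `wildcard` function can collect the files instead. This is only a sketch under the assumption that every module to document is named `AST.hs` or `Typechecker.hs` and sits exactly one directory below `src/`; the `SOURCES` variable is a hypothetical name, and the files are passed to Haddock in whatever order `wildcard` returns them rather than in the hand-written order above.

```makefile
# Assumption: all documented modules match these two patterns
SOURCES = $(wildcard src/*/AST.hs src/*/Typechecker.hs)
OPTS = exec -- haddock --html --hyperlinked-source --odir=docs

.PHONY: doc
doc:
	stack $(OPTS) $(SOURCES)
```

The explicit list has the advantage of documenting only the modules you choose, so the wildcard version is a trade-off rather than a strict improvement.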
I compile the research paper from LaTeX to PDF with this simple Makefile:
```
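.PHONY: research
research:
	# First pass: records the citation keys in the .aux file
	pdflatex typecheckingMonad.tex
	# Builds the bibliography (.bbl) from the .bib file
	bibtex typecheckingMonad
	# Second pass pulls in the bibliography; third pass resolves references
	pdflatex typecheckingMonad.tex
	pdflatex typecheckingMonad.tex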
```
The virtual machine (VM) relies on Vagrant and the Vagrantfile, where I can write all the commands to set up the VM. The one thing that I do not know how to automate is moving all of the documentation, once it is generated, into the VM. If you know how to transfer the files from the host machine to the VM, please share your solution in the comments. Currently, this means that I log in to the VM manually and place the documentation in the Desktop folder.
```
# All Vagrant configuration is done below. The "2" in Vagrant.configure
# configures the configuration version (we support older styles for
# backwards compatibility). Please don't change it unless you know what
# you're doing.
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/trusty64"
  config.ssh.username = "vagrant"
  config.ssh.password = "vagrant"
  config.vm.provider "virtualbox" do |vb|
    # Display the VirtualBox GUI when booting the machine
    vb.gui = true
    # Customize the amount of memory on the VM:
    vb.memory = "2048"
    vb.customize ["modifyvm", :id, "--vram", "64"]
  end
  config.vm.provision "shell", inline: <<-SHELL
    ## Installing dependencies, comment after this has been done once.
    # sudo apt-get update -y
    # sudo apt-get install ubuntu-desktop -y
    # sudo apt-get install -y build-essential linux-headers-server
    # echo 'PATH="/home/vagrant/.local/bin:$PATH"' >> /home/vagrant/.profile
    ## Comment and remove the folder sharing before submission
    mkdir -p /home/vagrant/Desktop/TypeChecker
    cp -r /vagrant/artefact-submission/* /home/vagrant/Desktop/TypeChecker/
    chown -R vagrant:vagrant /home/vagrant/Desktop/TypeChecker/
  SHELL
end
```
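On the open question of getting the generated files into the VM: the shell provisioner above already reads from `/vagrant`, the folder where Vagrant by default shares the project directory with the guest, so running `vagrant provision` after `make gen` should copy a fresh artefact to the Desktop as long as folder sharing is enabled. An alternative that does not depend on the shared folder is the `vagrant upload` command available in recent Vagrant releases. The target below is only a sketch: the `upload` target name and the two variable values are assumptions, and how a directory is placed at the destination is worth checking against the Vagrant documentation for your version.

```makefile
# Hypothetical values, matching the paths used in the Vagrantfile above
ARTEFACT_FOLDER = artefact-submission
VM_DEST         = /home/vagrant/Desktop/TypeChecker

.PHONY: upload
upload:
	# Make sure the VM is running before copying anything
	vagrant up
	# Copy the artefact folder from the host into the guest
	vagrant upload $(ARTEFACT_FOLDER) $(VM_DEST)
```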
With this final step, everything has been wired. You can see one example of the result [in HTML][13] and [in PDF][14]. I have created a [GitHub repo with all the source code][15] for ease of study and reproducibility.
I have used this setup for two conferences, the European Conference on Object-Oriented Programming (ECOOP) and the International Conference on Software Language Engineering (SLE), and we won the Distinguished Artifact Award at both.
--------------------------------------------------------------------------------
via: https://opensource.com/article/20/1/automating-documentation
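Author: [Kiko Fernandez-Reyes][a]
Topic selection: [lujun9972][b]
Translator: [translator ID](https://github.com/译者ID)
Proofreader: [proofreader ID](https://github.com/校对者ID)
This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)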
[a]: https://opensource.com/users/kikofernandez
[b]: https://github.com/lujun9972
[1]: https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/files_documents_paper_folder.png?itok=eIJWac15 (Files in a folder)
[2]: https://en.wikipedia.org/wiki/Artifact_%28software_development%29
[3]: https://en.wikipedia.org/wiki/Make_%28software%29
[4]: https://www.haskell.org/haddock/
[5]: https://pandoc.org/
[6]: https://www.vagrantup.com/
[7]: https://docs.haskellstack.org/en/stable/README/
[8]: https://linux.die.net/man/1/pdflatex
[9]: http://www.bibtex.org/
[10]: https://linux.die.net/man/1/zip
[11]: https://opensource.com/article/18/8/what-how-makefile
[12]: https://opensource.com/sites/default/files/uploads/makefile_pandoc_haddock.png (Artifact automation structure)
[13]: https://www.plresearcher.com/files/monadic-typechecker/README.html
[14]: https://www.plresearcher.com/files/monadic-typechecker/README.pdf
[15]: https://github.com/kikofernandez/pandoc-examples/tree/master/artefact-creation