On open collaborative proposal writing

This is a debriefing rant about the open collaborative writing of the OpenDreamKit proposal, mostly as notes for self (=Nicolas) for future occasions: how to approach it, what tools to use, some tricks and pitfalls to avoid, …

Context

The OpenDreamKit proposal was written collaboratively by a diverse team across Europe and science. Have fun watching this video, produced with Gource, of the activity on our collaborative work space on GitHub. A pretty good illustration – especially during the final sprint – of how a dispersed but dedicated team can collaborate efficiently with modern tools! You may also enjoy reading this fun blog post.

Stand by your dreams

Write a proposal that you and all participants believe is right.

Engage the community

The main point of writing this proposal in the open was to get feedback from the community. This allowed us, for example, to rework early on some phrasing that could have been misinterpreted.

Yet we did not get as much feedback as I was hoping for. This is probably because a proposal is a massive amount of information, so it’s hard for an outsider to find their way through it; in particular because much of the information is not directly relevant to the community, and because the overall structure tends to remain messy for a long time. Here are some tips that might help next time.

Engage your team

Easier to say than to do :-)

In short: run forward and trust your team mates: they have talent.

This blog post gave me an interesting perspective from the participant side. The key thing is that it’s both hard and intimidating to get involved.

For the technical side, see the next section.

Collaborative tools

Using the right tools is critical for productive collaborative proposal writing. In case you have a doubt, watch the video of the activity on our repository.

Some participants won’t be familiar with the tools, and may push toward basic solutions like dropbox+word. This is NOT an option when 20 people hack simultaneously on the same files. Stay firm, train them, and work hard to keep the technical barrier low.

(For this particular proposal, having some technical barrier was actually not such a bad thing: I wanted to select tech-savvy people anyway.)

Use a good version control system (e.g. git)

A decentralized version control system (like git or mercurial) has the key advantage of offering massive backups: every clone is a full copy of the history.

Having a web interface to the repository is nice for people to easily browse the information and do small edits.

GitHub did the job of hosting our repository. It was ok since we had only public information (no private info on the cloud, please!). There are of course alternatives (e.g. GitLab based).

Use a public repository

Be liberal in giving write access. People tend to be shy by default anyway; and if someone accidentally goes too far, it’s always possible to revert.

Use a good document writing infrastructure (e.g. LaTeX)

The proposal document contains a lot of redundant information (e.g. gantt chart, lists of tasks/deliverables/…, involvement measures, budget, …). It is vital, especially for the final rush, to have some appropriate infrastructure that builds automatically all of those from duplication free and well located semantic information.

LaTeX with the proposal style file is a good technical option.

Choose such an infrastructure, and make the compilation procedure completely automatic (make, with typically two variants to build either the draft version with comments or the final version). Get an expert on board to help, and push the infrastructure to its limits.

We went quite far in this direction. Next time I would want to go even further and also build the external budget files automatically from the semantic information in the proposal (not yet implemented).

There is one difficulty here: building the budget files requires storing some information that is private, like the salary of the participants. A good approach would be to have a secondary private git repository with read-write access granted only to, e.g., site leaders and staff. The private info could probably take the form of a single table-like LaTeX file associating info like salary to participants. The infrastructure should allow building the proposal itself with or without the private info.
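As a rough illustration of the "with or without the private info" idea, here is a minimal Python sketch. Everything in it is an assumption made for the example: the private table is stored as CSV rather than LaTeX, and the column names `participant` and `monthly_cost`, as well as the function names, are hypothetical. This is not what we implemented.

```python
import csv
import io

def load_salaries(csv_text):
    """Parse the (hypothetical) private table: one row per participant."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["participant"]: float(row["monthly_cost"]) for row in reader}

def budget_lines(person_months, salaries=None):
    """Return (participant, months, cost) tuples, sorted by participant.

    When the private salary table is not available, cost is None,
    so the public build still works and simply omits the amounts.
    """
    lines = []
    for participant, months in sorted(person_months.items()):
        cost = None
        if salaries is not None and participant in salaries:
            cost = months * salaries[participant]
        lines.append((participant, months, cost))
    return lines
```

The public repository would only hold the person-months; a build that has access to the private repository passes the salary table in, and any other build leaves it out.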

Document building and version control

Usual tip: don’t put automatically produced files (e.g. pdf’s) under version control to avoid conflicts.

Exception: if, like us, you use the web repository browser as the web site for the proposal, you may want to commit the proposal’s pdf from time to time to make it accessible to everybody. That’s alright, but do it under a specific name, so that people don’t conflict with it when they build the pdf locally. See e.g. the final dependency in our Makefile.

Finding a nice acronym

There are many acronym generators online that can be quite helpful in the process. But at the end of the day it’s all about creativity; get everyone involved. In our case we were rather stuck until Paul came up with the serendipitous idea to use “Digital” rather than “Virtual” (Dream sounds better than Vream :-) ).

Communication

Meetings

The meeting we had in September was fundamental to discuss the needs and shape the proposal. A bit earlier in the process (e.g. 6 months before) would have been good. A bit longer (3 days instead of 2) could have helped too. At the end of the meeting, there should be:

Having a second meeting one or two months before the submission would have been helpful to launch the final rush and brainstorm on the texts. We mitigated this with online chats.

Online video chats

Online video chats were extremely useful, especially for brainstorming with 2-3 people on sensitive pieces of text (intro to the proposal, work package descriptions, …), for getting to know each other, and for fast distribution of tasks at the end. Use them early on and regularly.

We used appear.in intensively for this. It can’t get simpler: just share a URL (we used http://appear.in/vre-math) with your collaborators and you are up and running. It worked for us with chrome, safari, firefox. It’s peer to peer, so in terms of privacy it should be alright.

A limitation: if one of the participants had low bandwidth, this seemed to affect everybody in the chat room; turning off the video usually helped.

When brainstorming on chunks of text we wanted to see the evolving text simultaneously. Git was not enough for this. So we used temporary google documents. Not very satisfying. I am still looking for a good peer-to-peer collaborative plain text editor.

Mailing list

We used a private mailing list. I believe that this helped people speak openly on more sensitive matters. But that’s debatable. Still we were liberal in including interested external people to get feedback.

Most of the communication went on this mailing list: roughly 3000 e-mails, with 400 of them in the last two days; crazy …

The good thing was that everybody could see what was going on. That’s good in particular at the beginning, when there is little action and it’s important to attract attention and to show off that the project is alive. Having an archive is important too so that anyone can dig for information. You can’t get this with private discussions.

However this went really overboard at the end: for all the participants that were only partially involved, sorting through all the information to find what was relevant to them was hard. Many of them spent as much time reading through the e-mails as really acting. This is totally unproductive.

I don’t have a good solution. I was trying to make it clear at the top of each e-mail who was directly involved (dear XXX); however this was clearly not sufficient. A step could be to set up some official keyword conventions for the subject line, to let people sort them efficiently. Something like:

H2020: WP leaders: ...
H2020: WP HPC: ...
H2020: Site leaders: ...
H2020: All participants: ...
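With such a convention, sorting becomes mechanical. Here is a minimal Python sketch of what a filter could look like; the exact pattern (an optional run of "Re:" prefixes, then "H2020:", then the audience keyword) and the function names are assumptions for illustration, not something we actually used.

```python
import re
from collections import defaultdict

# Assumed convention: "H2020: <audience>: <topic>", possibly prefixed by "Re:".
SUBJECT_RE = re.compile(r"^(?:Re:\s*)*H2020:\s*([^:]+):\s*(.*)$", re.IGNORECASE)

def audience_of(subject):
    """Return the audience keyword, or None if the subject ignores the convention."""
    match = SUBJECT_RE.match(subject.strip())
    return match.group(1).strip() if match else None

def group_by_audience(subjects):
    """Group subject lines by audience; non-conforming ones go to 'unsorted'."""
    groups = defaultdict(list)
    for subject in subjects:
        groups[audience_of(subject) or "unsorted"].append(subject)
    return dict(groups)
```

In practice most mail clients can express the same rule as a filter on the subject line; the point is only that the convention has to be strict enough for a machine to parse.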

Or maybe use some other communication media than mailing lists.

TODO list

It must be trivial for each participant to find out what (s)he could do for the project at any given point. The \TOWRITE{XXX}{...} lines in the latex sources are natural spots to write this information.

However extracting and synthesizing this information is necessary to really engage people (they won’t git pull the latest version and run grep by themselves unless they are already seriously engaged).
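The extraction itself is easy to automate. Below is a minimal Python sketch, assuming the macro is always written as \TOWRITE{who}{what} with no nested braces in either argument; the function names are made up for the example, and this is not the script we used.

```python
import re
from collections import defaultdict
from pathlib import Path

# Matches \TOWRITE{who}{what}; assumes neither argument contains braces.
TOWRITE_RE = re.compile(r"\\TOWRITE\{([^{}]*)\}\{([^{}]*)\}")

def extract_todos(latex_source):
    """Return (who, what) pairs, with whitespace in 'what' collapsed."""
    return [(who.strip(), " ".join(what.split()))
            for who, what in TOWRITE_RE.findall(latex_source)]

def todos_by_person(root):
    """Scan all .tex files under root and group the TODO items by person."""
    todos = defaultdict(list)
    for path in sorted(Path(root).rglob("*.tex")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for who, what in extract_todos(text):
            todos[who].append((path.name, what))
    return dict(todos)
```

Calling `todos_by_person(".")` at the root of the repository gives a per-person list that can be formatted and published automatically, e.g. at each commit.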

Sending the synthesis list by e-mail did not work that well. Instead it should be posted on the web. We were using initials (and keywords for tasks that could be handled by several persons in a group). First names probably would have worked better. In any case, the list of names / keywords should be standardized.
