“Science-in-the-open” [1] has the potential to redefine the traditional scientific framework and greatly accelerate high-impact research. It usually involves a large team of volunteer researchers working together asynchronously in a public Discord server. This approach lets you harness crowd-sourced intelligence to tackle ambitious problems collaboratively, and it attracts specialized expertise and creative approaches you might not find in a traditional lab.

Unfortunately, in practice, the benefits are often outweighed by management challenges. With an open invite, you’ll likely have a heterogeneous group of contributors: different backgrounds, skill levels, time zones, and commitment levels. Coordinating a large, decentralized team is hard. There’s a risk of wasting time onboarding people who later disappear, or of mentoring enthusiastic novices at the expense of research progress. Newcomers may find it impossible to figure out the project status or how to contribute amidst the chatter. Progress can stall because volunteers are under no obligation to stay or finish tasks.

Below, I outline strategies that, in my experience, help steer open science collaborations effectively. We used these strategies for our MindEye papers [2, 3], which were published at NeurIPS and ICML, and we are now introducing the same structure into all projects conducted in our MedARC Discord server.

Summary

In my opinion, the following strategies promote the best open science collaborations:

I’ll dive into each of these points below, then conclude with some optimistic thoughts on MedARC’s future role to support open science collaborations.

Keep the Codebase Simple and Flat

For open collaborations, it’s important to maintain an interpretable and lightweight codebase. Your repository should be as welcoming as possible to a newcomer who might be browsing it to understand the project. A flat code structure [4] with minimal dependencies seems to work best.