The global pandemic presented new challenges and opportunities for organizing conferences, and OHBM 2021 was no exception. The OHBM Brainhack is an event held just before the OHBM meeting, typically in person, where scientists of all levels of expertise and interest gather for a few days to work and learn together in a collaborative, hacking-style environment on projects of common interest (1). Building on the success of the OHBM 2020 Hackathon (2), the 2021 Open Science Special Interest Group came together online to organize a large coordinated Brainhack event that took place over the course of 4 days. The OHBM 2021 Brainhack was organized around two guiding principles. The first was to provide a highly inclusive, collaborative environment in which scientists across disciplines and levels of expertise could interact and push forward important projects that need support; this is the “Hack-Track” of the Brainhack. The second was to empower scientists to improve the quality of their scientific endeavors by providing high-quality, hands-on training on best practices in open-science approaches, as exemplified by the training events of the “Train-Track.” Here, we briefly explain both of these elements of the OHBM 2021 Brainhack before continuing on to the Brainhack proceedings.
At the beginning of the hackathon, scientists propose projects to work on; each presenter has a few minutes to describe the project and what they hope to accomplish over the timeframe of the hackathon and beyond (see (3) for a list of projects from OHBM 2021). Projects can vary widely in topic and scope (e.g., coding, writing, documentation, and community guidelines), and participants can join a single project or split their time between multiple projects, depending on what suits them best. For example, a project may be on a topic they are already knowledgeable about, or it may instead be a topic in which they want to gain some expertise. Projects may involve novel methods or programming languages that could be useful for participants’ other work, or a project may have a team that they simply want to get to know and have the opportunity to work with. Whatever the reason for joining a project, all contributions are welcome. Over the course of 4 days, the 2021 hackathon pushed forward 24 projects through a collaboration of 200 scientists from 28 countries spread across Africa, the Asia-Pacific region, Europe, the Middle East, North America, and South America. We deployed a variety of technologies to facilitate exchanges between our participants, despite considerable challenges posed by the diverse geopolitical landscape; in particular, participants in Asia and the Middle East faced obstacles in accessing some online platforms. Sparkle was used as our main interface and hosting environment, in line with the main OHBM conference (4). Crowdcast (5), Zoom Webinars (6), YouTube (7), and DouYu (8) provided the technical backbone for streaming and session recordings. Gather.town and Sparkle provided customized experiences for the social gatherings.
The intense development work during the Brainhack was complemented and supported by the train-track, which provided opportunities for acquiring and honing transferable skills in a structured format so that participants could improve the quality of their scientific work. Scientists with diverse skill sets participated in many hands-on training sessions and submitted 110 feedback forms across the range of topics we hosted, including version control, code testing, reproducible workflows, neuroimaging data visualization, machine learning, and community building in open science. The train-track also provided space to raise awareness of diversity, equity, and inclusivity. The train-track was composed of three components. First, we provided prerecorded, high-quality mini-lectures sourced from the community, covering the following topics: machine learning, data visualization, version control, code testing, reproducible workflows, and community building. Second, for each set of mini-lectures, attendees could sign up for live “question and answer” sessions with experts on each topic, as well as hands-on sessions in which the participants, organized in small groups, could apply the principle of “learning by doing” guided by a dedicated expert (9). Hands-on sessions were led by early career researchers who had been selected as open-science fellows; these fellows applied earlier in the year and were given stipends in recognition of their contributions to open science. These sessions provided ample training opportunities for young scientists, especially those from countries where this type of training is not easily accessible. Third, we worked with the OHBM BrainArt Special Interest Group to provide prerecorded videos on a range of brain art projects and a live session to discuss brain art. For more information on the train-track and to use the tutorials we provided, see: https://ohbm.github.io/hackathon2021/traintrack/.
Here, we present the OHBM 2021 Brainhack proceedings. There are 13 projects below.
Brain QR Modem
Are you ever annoyed by how hard it is to get brain data off the scanner? Scanners usually contain private information about patients and are thus embedded in maximally restrictive clinical cyber-security network environments, which makes it quite complicated to get access to the data. This is especially true when visiting collaborating sites. In this hackathon project, we aimed to develop a purely unidirectional (safe) data streaming “hack” to transfer magnetic resonance imaging (MRI) data directly to the cloud via dynamic QR codes.
Our setup is inspired by the early days of the Internet, when a modem (modulator-demodulator) was used to (i) convert digital information into audio streams, (ii) transfer them across telephone lines, and (iii) convert them back into the digital domain. Here, we aim to do the same thing with the pixel data of MRI scans; however, instead of audio signals, we use machine-readable visual information: QR codes (1).
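The modulate/demodulate cycle can be sketched in a few lines of Python. This is only an illustration of the chunk-and-stream idea: base64 text stands in for the rendered QR codes, and the frame-indexing scheme is our own assumption, not the actual ICE functor protocol.

```python
import base64

def modulate(pixel_data: bytes, chunk_size: int = 256):
    """Split raw pixel data into indexed, text-encoded frames.

    In the real setup each frame would be rendered as a QR code;
    here base64 text stands in for the QR payload.
    """
    n_chunks = (len(pixel_data) + chunk_size - 1) // chunk_size
    for i in range(n_chunks):
        chunk = pixel_data[i * chunk_size:(i + 1) * chunk_size]
        # Prefix every frame with "index/total" so the receiver can
        # detect missing frames and reassemble them in order.
        yield f"{i}/{n_chunks}:" + base64.b64encode(chunk).decode()

def demodulate(frames):
    """Reassemble the original bytes from the received frames."""
    ordered = sorted(frames, key=lambda f: int(f.split("/", 1)[0]))
    return b"".join(base64.b64decode(f.split(":", 1)[1]) for f in ordered)

data = bytes(range(256)) * 4          # stand-in for MR pixel data
frames = list(modulate(data))
restored = demodulate(frames)
```

Because each frame carries its own index, the receiver can reassemble the stream even if frames arrive out of order, and a dropped frame is detectable from a gap in the indices.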
Specific aims of the Brain QR modem:
1. We aim to develop a SIEMENS ICE (Image Calculation Environment) functor that converts pixel data into QR-code streams.
2. We aim to modify an existing Android app (2) that converts the streamed QR codes into a series of PNGs to be stored in the cloud.
What was achieved
We developed a new ICE functor (modulator unit) that converts MR pixel data into QR codes. It can be included in any custom reconstruction chain of sequences on SIEMENS scanners and was tested in the specific setup of a single-slice turbo-FLASH sequence on a MAGNETOM 7T scanner running VB17 software. We also developed frequency controllers that allow adjustable QR-code refresh rates matched to the capabilities of the smartphone cameras used.
Currently, the modulator transmits only pixel data and their dimensions. This means that personally identifiable information is removed from the data, so that they can be transmitted and stored with minimal privacy concerns.
In future Brainhack events, we will further optimize the brain QR modem setup. Specifically, we will include LayNii (3) capabilities in the ICE functor to send pixel data directly in the NIfTI (.nii) file format. Furthermore, we need to augment the TXQR tester Android app (2) to receive QR-code streams beyond the PNG standard, namely by converting it into a multislice receiver that does not need to store all data in large mosaics.
More information and video introductions can be found in (4,5).
We thank Scannexus B.V. for kindly providing scan time to test the functor in real life (A0178).
1. Nayuki, QR-Code-generator, Github release v1.6.0, 2021, https://github.com/nayuki/QR-Code-generator/
2. Daniluk Ivan, TXQR, Github, Commit d92929c, 2019, https://github.com/divan/txqr
3. Huber L, Poser BA, Bandettini PA, et al. LAYNII: A software suite for layer-fMRI. Neuroimage. 2021;237:118091. https://doi.org/10.1016/j.neuroimage.2021.118091
4. Huber et al., Brainhack issue, 2021, https://github.com/ohbm/hackathon2021/issues/4
5. Huber et al., Project website, 2021, https://layerfmri.com/brainqr/
NeuroDesk – A Cross-platform Data Analysis Environment for Reproducible Neuroimaging
Neuroimaging researchers require a diverse collection of bespoke command-line and graphical tools to analyze data and answer research questions. Installing and maintaining a neuroimaging software setup is challenging and often results in nonreproducible environments. Package managers and software repositories can help with this, and NeuroDebian (2) is a well-known example that has drastically improved neuroscience software distribution. However, a limitation of NeuroDebian is the inability to install neuroscientific software on the Linux flavors commonly used on high-performance computing systems, or on Windows and MacOS. Researchers therefore still struggle to access the required software or to move analyses between different computing platforms due to the setup work required, ultimately limiting interoperability and reproducibility and impeding the broad sharing of analysis pipelines with the community. Container technology, such as Docker (5) or Singularity (3), enables the execution of software on different systems and could aid in distributing scientific software. We aim to develop a platform built on container technology for processing and analyzing neuroimaging data, with the goal of lowering the barrier to using various neuroimaging software in a reproducible environment.
What was achieved during the hackathon
We developed a modular and open analysis environment consisting of a continuous integration system that uses GitHub Actions to automatically build neuroimaging software containers (3) with neurodocker (4). To enhance reproducibility, we compile toolboxes that require proprietary software, such as MATLAB, so that they use their respective runtime environments. By providing a separate container for each neuroimaging software package, we enable a fully reproducible environment while also avoiding dependency conflicts between different tools. We developed wrapper scripts that transparently integrate these containers into any workflow without modifying existing scripts, for example in workflow systems like Nipype (1). To keep the system lightweight, users download the containers on demand from either a CVMFS distribution network or object storage locations in the United States, Europe, and Australia. Finally, we offer a lightweight Linux desktop container accessible via a browser interface that runs on any operating system (see Figure 1) and provides an easy-to-use GUI, lowering the barrier to entry (no installation required whatsoever) and ensuring the interoperability of the NeuroDesk environment itself.
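The transparent-wrapper idea can be illustrated with a short sketch. The cache directory, container name, and tool below are hypothetical examples, and a real wrapper would also fetch the image on first use and actually execute the command rather than just constructing it.

```python
import os
import shlex

# Hypothetical local cache for downloaded container images.
CONTAINER_DIR = os.path.expanduser("~/neurodesk-containers")

def containerized_command(tool: str, container: str, *args: str) -> str:
    """Build the command line that runs `tool` inside its container.

    A wrapper script named after the tool forwards its arguments into
    `singularity exec` on the matching image, so existing pipelines
    (e.g., Nipype interfaces) can call `tool` unchanged.
    """
    image = os.path.join(CONTAINER_DIR, container + ".sif")
    cmd = ["singularity", "exec", image, tool, *args]
    return " ".join(shlex.quote(part) for part in cmd)

# Example: running FSL's bet through a (hypothetical) fsl container.
print(containerized_command("bet", "fsl_6.0.4", "t1.nii.gz", "t1_brain.nii.gz"))
```

Because each tool resolves to its own image, two tools with conflicting dependencies can coexist in the same pipeline without interfering with each other.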
The version of NeuroDesk after the hackathon was a proof of concept that showed the potential of our project, but more work was required to make it user-friendly and robust enough to enable widespread uptake in the community. We used this NeuroDesk prototype to apply for funding from the Australian Research Data Commons, Oracle for Research, and The University of Queensland, and with this support, we developed a mature project accessible to every researcher worldwide: https://www.neurodesk.org/
Figure 1. A screenshot of a NeuroDesk instance in a Browser (Windows 10, Edge Browser).
The authors acknowledge the facilities and scientific and technical assistance of the National Imaging Facility, a National Collaborative Research Infrastructure Strategy capability, at the Centre for Advanced Imaging, University of Queensland, and at Swinburne Neuroimaging, Swinburne University of Technology.
1. Gorgolewski K, Burns CD, Madison C, et al. Nipype: A Flexible, Lightweight and Extensible Neuroimaging Data Processing Framework in Python. Front Neuroinformatics. 2011;5. doi:10.3389/fninf.2011.00013
2. Halchenko YO, Hanke M. Open is Not Enough. Let’s Take the Next Step: An Integrated, Community-Driven Computing Platform for Neuroscience. Front Neuroinformatics. 2012;6. doi:10.3389/fninf.2012.00022
3. Kurtzer GM, Sochat V, Bauer MW. Singularity: Scientific Containers for Mobility of Compute. PLoS ONE. 2017;12(5):e0177459. doi:10.1371/journal.pone.0177459
4. ReproNim/Neurodocker. Center for Reproducible Neuroimaging Computation; 2021. Accessed January 22, 2021. https://github.com/ReproNim/neurodocker
5. Willis J. Docker and the Three Ways of DevOps. Docker white paper.
User Research Study for Decentralized Open Science Powered by Opscientia’s Web3 Tech Stack
At Opscientia, we are building an open-science ecosystem powered by Web3 tools, starting with neuroscience. During the event, we ran a user research study to validate our assumptions about the academic “pain points” we are solving and to gather feedback and FAQs on our project and initial website wireframe from this specific user group.
One-on-one user research interviews were carried out with neuroscientists during the event (n=5). Participants were at different career stages, came from different geographical regions, and used a range of data types (e.g., EEG, MRI, PET).
We validated “pain points” around data, including: difficulties accessing datasets, uncertainty around permissions and data protection laws, a lack of tools and standards for easily sharing data, storing multiple versions, and difficulty computing on stored data.
We validated “pain points” around academia, including: value being based on journal impact factor as opposed to “the work itself” (e.g., through GitHub-like repositories), large amounts of unpaid work, and inadequate access to resources (e.g., training).
Feedback on our project was overall positive (“I think it is great. I am usually very depressed about science for most of the year. Our generation thinks differently and sees what our supervisors are battling. We see it has to work differently to how it is now”, “I think it is amazing. I love the central spirit of the project. I love how our generation is breaking the boundaries”, “Oh my gosh. I had not jumped on the decentralized train just yet…I would love to support this”), with some additional points raised (“It sounds a bit over ambitious”, “It will only work once the community is large enough”).
We gathered many FAQs: see https://opsci.io/faqs/ for the questions raised and our answers.
We also gathered feedback on our website wireframe: people generally liked the name and layout and made suggestions for labels/links.
We are expanding on this initial research to interview a wider group of neuroscientists (target n=30) and gather feedback on our prototype design. We then plan to revisit a subgroup of these participants for user testing of our prototype once it is built.
A Jupyter Book of Everything Brainhack: Past, Present, and Future
The Brainhack Jupyter Book is the online companion material of the Brainhack opinion piece published in Neuron (1). The content consists of a glossary of terms related to Brainhack and the theme, location, and date of past Brainhack events. The aim is to update the content as Brainhack develops and evolves. The project involves tasks of varying difficulty, from content contribution via markdown editing to feature development and infrastructure maintenance with Jupyter Book and Python. Hence, it makes a good beginner project for familiarizing oneself with the collaborative workflow on GitHub, while also providing an opportunity for more advanced programmers to expand and apply web design and software maintenance skills. During the OHBM 2021 Brainhack, several participants worked on the project. J.B. set out to learn the collaborative review feature, reviewed one pull request, and started another to improve the navigation of the contributor guidelines by adding collapsible summaries to shorten the document. A.N. contributed 16 glossary entries. H.T.W. carried out maintenance of the repository and reviewed and answered participants’ questions related to the project. The maintenance will continue, and we hope to recruit more regular maintainers and encourage newcomers to participate. The glossary is open for expansion, and a translation initiative is looking for submissions and reviewers. In future Brainhacks, we hope the Brainhack Jupyter Book can be a recurring project through which different members of the community gain hands-on experience of collaborative work on GitHub, while enriching our records of the Brainhack community.
Towards an Agreed Reporting Template for ERP Methodology – International Standard (ARTEM-IS)
In the EEG “Garden of Forking Paths,” most research reports do not contain enough detail to determine which path was taken, hampering reproducibility, replication, and meta-science (1). Our initiative to create an “Agreed Reporting Template” for reporting EEG methods includes a draft template spreadsheet. This template has been designed for reporting the methods of EEG studies, specifically event-related potential (ERP) studies, thoroughly and accurately by providing fields in which researchers enter specific methodological details (2). The rationale for this tool, and how you can help make it better, can be found in our preprint (3).
At the time of the hackathon, ARTEM-IS was working toward turning the template into a webapp based on the model of COBIDAS Information Collection Protocol: https://ohbm.github.io/eCOBIDAS/#/. This webapp would have both machine-readable and human-readable outputs that can be treated as supplements to publications and preregistrations. It would be compatible with other standards and guidelines, such as Brain Imaging Data Structure and COBIDAS.
The primary goal of this “hacking” project was to convert the existing template into a format suitable as the backend of the webapp. The secondary goal was to collaborate on further refinement of the template contents.
Over the 3 days of the event and the 2 days of follow-up post-Brainhack meetings, 17 people contributed to the project. The team coordinated its efforts through two daily team meetings in a dedicated ARTEM-IS Gather.town clubroom, with both asynchronous and synchronous working sessions in between.
As a result, the spreadsheet (which makes up the backend of the app) was expanded from the 91 fields available pre-Brainhack to 193 fields. Many of the existing fields were improved through discussion among team members. In the process, 46 issues and suggestions for improvement were raised, 22 of which were settled immediately, while the rest were left to be resolved later. Furthermore, 13 additional items were resolved, and new questions were raised, during follow-up meetings. Files containing more detailed reports on the progress can be found in the OSF repository of the hack (4), and a link to up-to-date information on the webapp can be found here: https://github.com/INCF/artem-is
At the time of revising this document (June 2022), we have reached both our short-term goal of finalizing the first version of the webapp with machine-readable output and our long-term goals of creating a human-readable output of the webapp for supplementary documents and importing prefilled information.
1. Šoškić, A., Jovanović, V., Styles, S. J., Kappenman, E. S., & Ković, V. (2021). How to do Better N400 Studies: Reproducibility, Consistency and Adherence to Research Standards in the Existing Literature. Neuropsychology Review. https://doi.org/10.1007/s11065-021-09513-4
2. Šoškić, A., Ković, V., Ke, H., & Styles, S. J. (2020). ARTEM-IS: Agreed Reporting Template for EEG Methodology - International Standard. https://doi.org/10.17605/OSF.IO/PVRN6
3. Styles, S. J., Ković, V., Ke, H., & Šoškić, A. (2021). Towards ARTEM-IS: Design Guidelines for evidence-based EEG methodology reporting tools. NeuroImage (245), 118721. https://doi.org/10.1016/j.neuroimage.2021.118721
4. Styles, S. J., Šoškić, A., Ković, V., Ke, H., Gau, R., Pavlov, Y. G., … Yap, D. F. (2021). ARTEM-IS Web App OHBM Brainhack 2021. https://doi.org/10.17605/OSF.IO/ARD3W
Flooding Brains: Mesmerizing and Relaxing Volumetric Brain Animations
Fast-marching methods (1) form the backbone of many geodesic distance algorithms used to compute distances between points on a brain surface. This project set out to show the beauty of the fast-marching method while it is being computed. In addition, the project made use of exotic animal brains from https://braincatalogue.org/ (including an okapi, a giraffe, and a dolphin brain) to bring awareness to the similarities between the human brain and those of other mammals.
The LayNii (2) programs “LN2_GEODISTANCE” and “LN2_IFPOINTS”, in combination with the PyVista visualization library (3), were used to generate animations that give the impression of brain surfaces being flooded with colorful liquids. The scripts developed for the animations are available in a public repository: https://github.com/ofgulban/flooding_brains. The resulting animations can be seen at https://youtu.be/XFvYewxzXno (Figure 1). The animations are deemed qualitatively mesmerizing and satisfying. Future projects will build on the animation expertise acquired here to create even more mesmerizing and satisfying animations. From a societal impact perspective, this project seemed to generate momentary excitement and relief for the Brainhack participants watching the virtual event. The authors believe that such light-hearted projects offering entertainment value are needed amidst the COVID-19 pandemic. However, we also note that, while entertaining to watch, this project offered genuine insight into an essential modern computer algorithm.
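The wavefront propagation that the animations visualize can be sketched with a simple stand-in: Dijkstra's algorithm on a pixel grid, which, like fast marching, expands a front outward from seed points in order of increasing distance. Fast marching proper solves the eikonal equation on the surface mesh; the grid, seeds, and unit step costs below are purely illustrative.

```python
import heapq

def flood_distances(grid, seeds):
    """Approximate geodesic distances from seed points on a 2D mask.

    Dijkstra on the 4-connected grid, used here as a simple stand-in
    for the fast-marching method: both expand a wavefront outward from
    the seeds in order of increasing distance, which is exactly what
    the flooding animations render.
    """
    rows, cols = len(grid), len(grid[0])
    dist = {s: 0.0 for s in seeds}
    heap = [(0.0, s) for s in seeds]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist.get((r, c), float("inf")):
            continue  # stale heap entry, already settled with a shorter path
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc]:
                nd = d + 1.0  # unit step cost between neighboring cells
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    heapq.heappush(heap, (nd, (nr, nc)))
    return dist

mask = [[1] * 5 for _ in range(5)]    # toy "surface": a full 5x5 patch
dist = flood_distances(mask, seeds=[(0, 0)])
```

Rendering the set of vertices whose distance falls below a growing threshold t produces the expanding, liquid-like front seen in the videos.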
1. Sethian, J. A. (1996). A fast marching level set method for monotonically advancing fronts. Proceedings of the National Academy of Sciences of the United States of America, 93(4), 1591–1595. https://doi.org/10.1073/pnas.93.4.1591
2. Huber, L. (Renzo), Poser, B. A., Bandettini, P. A., Arora, K., Wagstyl, K., Cho, S., … Gulban, O. F. (2021). LayNii: A software suite for layer-fMRI. NeuroImage, 118091. https://doi.org/10.1016/j.neuroimage.2021.118091
3. Sullivan, C. B., & Kaszynski, A. A. (2019). PyVista: 3D plotting and mesh analysis through a streamlined interface for the Visualization Toolkit (VTK). Journal of Open Source Software, 4(37), 1450. https://doi.org/10.21105/joss.01450
Figure 1. Volume-rendered animations showing evolving wavefronts on brain surfaces: https://youtu.be/XFvYewxzXno. The wavefronts evolve from different points distributed over the surface. The brains shown, in order, are: an okapi brain (00:00–00:29), a giraffe brain (00:29–01:05), and a dolphin brain (01:05–02:12).
An Open Code Pledge for the Neuroscience Community
THE COLLECTIVE ACTION PROBLEM IN ACADEMIA
Neuroscience relies on research code for experimentation, data cleaning, and analysis. If individual researchers share their code openly, the collective neuroscience community benefits through verifiability, greater reproducibility, and less unnecessary duplication (1,2). Most neuroscientists do not share their code because nonarticle research outputs are not currently rewarded in academia and thus individuals may perceive sharing code as a risk to their career (1,3). The global neuroscience community is thus trapped in a collective action problem (4): the community is failing to provide a public good (open source code) due to competing interests at the individual level (5).
A NEW SOLUTION: CONDITIONAL PLEDGES
Project Free Our Knowledge (FOK) aims to solve this class of problems by organizing collective action between researchers. FOK facilitates conditional pledges: commitments to adopt open and reproducible research practices on the condition that N researchers pre-agree to take action (Figure 1). If the threshold is met, the pledging community takes action together, empowering individuals to achieve their common aims. If the threshold is not met, however, no further action is required, thus mitigating the risks associated with solitary action. FOK draws on prior conditional-pledge platforms that have helped millions of globally dispersed individuals overcome collective action problems in the economic (e.g., Kickstarter), cultural (e.g., CollAction), and political (e.g., PledgeBank; (6)) spheres, and seeks to bring a comparable but tailored solution to the research community.
THE OPEN SOURCE AND CITABLE CODE PLEDGE
During Brainhack 2021, we developed a conditional pledge campaign that aims to motivate neuroscientists to share their code in a public repository (e.g., Zenodo, OSF) with a persistent identifier (e.g., digital object identifier [DOI]). Coinciding with the present publication, we hereby invite all neuroscientists to take the Open source and citable code pledge:
“I pledge to share the code underlying all of my future publications in a citable repository.”
Pledgers can either begin sharing their code immediately (cf. (7)) or wait until N neuroscientists have taken the pledge (conditional pledge; see the FOK website for details).
We intend to promote the campaign through various communication channels and strategies (e.g., ambassador network; Figure 1A). If the campaign reaches the threshold, pledgers will be publicly listed and directed to take action together (Figure 1B). We will then analyze the campaign outcomes (e.g., pledge compliance, citation metrics) and use this information to maximize the impact of future campaigns that target a range of open and reproducible research practices in different research fields (Figure 1C,D). In short, we seek to establish a sustained, evidence-based movement for social change in academia (Figure 1E). We hereby invite all researchers to join us in this vision by proposing new campaigns and taking pledges on the FOK website today.
Figure 1. Using conditional pledges to solve collective action problems in academia (e.g., code-sharing in neuroscience). (A) Current state. A minority of neuroscientists share code (orange dots). A larger latent group (4) would be willing to share code if they were not alone in doing so (orange-bordered dots). Ambassador network (solid black lines) promotes the campaign. Researchers sign conditional pledges (dashed lines) to act when a predetermined threshold is met. (B) Campaign reaches the threshold. Pledgers share their code (orange dots). New ambassadors join the network (solid lines). Campaign outcomes are analyzed and used to inform the design of a follow-up campaign (or multiple campaigns). (C) Follow-up campaign is launched. Ambassadors promote the campaign and a larger cohort of researchers sign conditional pledges. (D) Future state. The majority of researchers share code (orange dots), flipping the social norm so that individuals who do not share face social pressure (gray dots). (E) Developing a sustained movement for social change. At any one point, the number of researchers who would be willing to share code if they had peer support (orange line) is greater than the number of researchers currently sharing (orange patch). Each campaign serves to connect this latent group via conditional pledges (dashed lines) and coordinate collective action once the target threshold is reached. Successive campaigns leverage the established community to reach greater thresholds and/or motivate new behaviors (e.g., “FAIR” code; (8–10)). Perceived risk to individuals’ careers is inversely related to the number of people sharing (blue line). Following the Nth campaign, sharing code becomes normative and perceived risks are dramatically reduced (note that campaigns can also evolve in parallel, rather than sequentially as pictured here).
1. Eglen, S. J., Marwick, B., Halchenko, Y. O., Hanke, M., Sufi, S., Gleeson, P., Silver, R. A., Davison, A. P., Lanyon, L., Abrams, M., Wachtler, T., Willshaw, D. J., Pouzat, C., & Poline, J.-B. (2017). Toward standard practices for sharing computer code and programs in neuroscience. Nature Neuroscience, 20(6), 770–773. https://doi.org/10.1038/nn.4550
2. Riquelme, J. L., & Gjorgjieva, J. (2021). Towards readable code in neuroscience. Nature Reviews Neuroscience, 22(5), 257–258. https://doi.org/10.1038/s41583-021-00450-y
3. LeVeque, R. J. (2013, April 1). Top Ten Reasons To Not Share Your Code (and why you should anyway). Siam News. https://sinews.siam.org/Details-Page/top-ten-reasons-to-not-share-your-code-and-why-you-should-anyway
4. Olson, M. (1971). The Logic of Collective Action: Public Goods and the Theory of Groups, Second Printing with a New Preface and Appendix. Harvard University Press. https://doi.org/10.2307/j.ctvjsf3ts
5. Coelho, L. P. (2013, May 6). People are Right Not to Share Scientific Code… but we are wrong to let them get away with it. Meta Rabbit. https://metarabbit.wordpress.com/2013/05/06/people-are-right-not-to-share-scientific-code/
6. Hallam, R. (2016). How the internet can overcome the collective action problem: Conditional commitment designs on Pledgebank, Kickstarter, and The Point/Groupon websites. Information, Communication & Society, 19(3), 362–379. https://doi.org/10.1080/1369118X.2015.1109696
7. Gleeson, P., Davison, A. P., Silver, R. A., & Ascoli, G. A. (2017). A commitment to open source in neuroscience. Neuron, 96(5), 964–965. https://doi.org/10.1016/j.neuron.2017.10.013
8. Chue Hong, N. P., Katz, D. S., Barker, M., Lamprecht, A.-L., Martinez, C., Psomopoulos, F. E., Harrow, J., Castro, L. J., Gruenpeter, M., Martinez, P. A., Honeyman, T., Struck, A., Lee, A., Loewe, A., van Werkhoven, B., Jones, C., Garijo, D., Plomp, E., Genova, F., … Yehudi, Y. (2021). FAIR Principles for Research Software (FAIR4RS Principles). Research Data Alliance, 32. https://doi.org/10.15497/RDA00065
9. Goble, C., Cohen-Boulakia, S., Soiland-Reyes, S., Garijo, D., Gil, Y., Crusoe, M. R., Peters, K., & Schober, D. (2020). FAIR computational workflows. Data Intelligence, 2(1–2), 108–121. https://doi.org/10.1162/dint_a_00033
10. Lamprecht, A.-L., Garcia, L., Kuzak, M., Martinez, C., Arcila, R., Martin Del Pico, E., Dominguez Del Angel, V., van de Sandt, S., Ison, J., Martinez, P. A., McQuilton, P., Valencia, A., Harrow, J., Psomopoulos, F., Gelpi, J. L., Chue Hong, N., Goble, C., & Capella-Gutierrez, S. (2020). Towards FAIR principles for research software. Data Science, 3(1), 37–59. https://doi.org/10.3233/DS-190026
The BIDS Application Standard (BEP027)
An increase in data sharing over recent years has demanded the development and adoption of standards to ensure that published resources are interoperable, accessible, and reusable (1). In neuroimaging, the Brain Imaging Data Structure (BIDS; (2)) has emerged as a community standard for organizing and describing datasets that encompasses common use-cases and supports extension alongside the ever-changing scientific landscape. Alongside the data standard, BIDS apps (3) were developed with the goal of abstracting application-specific idiosyncrasies from tools and simplifying their application to public datasets. However, the BIDS apps structure was never formally standardized, and its extensibility alongside the data standard was nonexistent. This project sought to formalize the BIDS application specification and provide a roadmap for tool developers to migrate their tools from the original BIDS app model to the updated standard. Core to this initiative are the values of descriptiveness, extensibility, and accessibility. Over the course of the OHBM Brainhack, the following changes and extensions of the BIDS apps specification were agreed upon:
1. Rather than defining a rich set of mandatory arguments that would need updating as the BIDS standard evolves, a naming convention was defined for reserved arguments. While many applications will contain a consistent set of parameters, such as which subjects to analyze, imposing the inclusion of arguments widely would lead to unnecessary bloat in many applications. Instead, all BIDS entities may be mapped to arguments using a defined formula (i.e., <entity_label> or <entity_index>).
2. Rather than being overly prescriptive in defining how command-line interfaces are constructed, which would limit accessibility, the Boutiques descriptive framework (4) will be used to abstract interfaces from users. This model allows for the flexible construction of tools, while Boutiques descriptors contain the mapping of BIDS-standardized values, such as entity names (e.g., subject, session), to the terms used within each tool. A BIDS application runner will be built as a lightweight wrapper around Boutiques (5,6) to allow researchers to validate and launch potential experiments.
3. Clear and transparent usage reports will be provided by complying applications. The BIDS application runner will record output and error logs, resource consumption, and exit status during tool execution. A set of exit status values has been defined and reserved for BIDS-related failures to further increase transparency.
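As an illustration of the first point, a minimal Python sketch of how BIDS entities could map to reserved argument names under the proposed naming convention. The entity subset and the exact flag syntax are assumptions for illustration, not part of the agreed specification:

```python
# Illustrative sketch only: deriving reserved command-line argument names
# from BIDS entities following the "<entity>_label / <entity>_index" formula.
# The entity subset and the "--" flag prefix are assumptions, not the agreed spec.

BIDS_ENTITIES = {
    "subject": "label",   # entities whose values are free-form labels
    "session": "label",
    "task": "label",
    "run": "index",       # entities whose values are numeric indices
}

def reserved_argument(entity: str) -> str:
    """Return the reserved argument name for a given BIDS entity."""
    return f"--{entity}_{BIDS_ENTITIES[entity]}"

def reserved_arguments() -> list:
    """List all reserved argument names for the known entities."""
    return [reserved_argument(entity) for entity in BIDS_ENTITIES]
```

Under such a convention, new entities added to the BIDS standard would automatically gain a reserved argument name without any change to the apps specification itself.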
The specification is being developed openly and in an ongoing fashion. All contributions are welcome through the BIDS specification communication channels and GitHub repository: https://github.com/bids-standard/bids-specification/issues/313.
1. Wilkinson, M. D. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3, 160018 (2016).
2. Gorgolewski, K. J. et al. The brain imaging data structure, a format for organizing and describing outputs of neuroimaging experiments. Sci Data 3, 160044 (2016).
3. Gorgolewski, K. J. et al. BIDS apps: Improving ease of use, accessibility, and reproducibility of neuroimaging data analysis methods. PLoS Comput Biol 13, e1005209 (2017).
4. Yarkoni, T. et al. PyBIDS: Python tools for BIDS datasets. JOSS 4, 1294 (2019).
5. Glatard, T. et al. Boutiques: a flexible framework to integrate command-line applications in computing platforms. Gigascience 7, (2018).
Sea: A Lightweight Filesystem to Process Large Neuroimaging Data on HPC
Open neuroimaging datasets have reached the petabyte scale. Depending on the modality, datasets can consist of very large files, such as the BigBrain (1), or large numbers of reasonably sized files (MBs to a few GBs), such as the Human Connectome Project (HCP) (2), UK Biobank (3), and CoRR (4). Processing large datasets may incur significant data transfer overheads, with the bulk of the processing time attributed to data management. This overhead creates a barrier to analyzing such datasets thoroughly. To reduce the processing time of large neuroimaging datasets, we created a tool that improves application data management by leveraging different levels of local cache.
High-performance computing (HPC) clusters are commonly used by researchers to process scientific data and typically rely on a shared network-based parallel file system (e.g., Lustre) to store it. Since the file system is shared, data transfer performance depends not only on pipeline operations but also on collective file system usage.
To facilitate the processing of large-scale neuroimaging pipelines on HPC systems, we built Sea, a hierarchical file system that leverages available caches to offload writes from the shared file system (Figure 1). Sea can be used as an “add-on” to existing neuroimaging tools, as it intercepts application calls to the file system and redirects them to the most appropriate location.
As access to compute-local storage is temporary, Sea only has access to the cache space during processing. However, users may need cached data after processing, for quality control or future analyses. As a result, Sea provides flushing and eviction capabilities. Flushing copies the cached data to the shared file system, whereas eviction frees unneeded data in the cache to increase the space available for future data. Should a cache reach capacity, Sea writes to the next-level cache in the hierarchy or to the shared file system.
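The write, flush, and eviction behavior described above can be sketched as follows. This is a simplified, hypothetical Python model for illustration, not the actual Sea implementation, which intercepts file system calls at a much lower level:

```python
import shutil
from pathlib import Path

# Hypothetical simplification of Sea's strategy: try each cache level in
# order of preference and fall back to the shared file system when full.

class CacheLevel:
    def __init__(self, root: str, capacity_bytes: int):
        self.root = Path(root)
        self.capacity = capacity_bytes
        self.used = 0

    def has_room(self, nbytes: int) -> bool:
        return self.used + nbytes <= self.capacity

    def write(self, name: str, data: bytes) -> Path:
        path = self.root / name
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
        self.used += len(data)
        return path

class HierarchicalCache:
    def __init__(self, levels, shared_root: str):
        self.levels = levels          # e.g., [tmpfs, local SSD]
        self.shared = Path(shared_root)

    def write(self, name: str, data: bytes) -> Path:
        for level in self.levels:
            if level.has_room(len(data)):
                return level.write(name, data)
        # All caches full: write directly to the shared file system.
        path = self.shared / name
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
        return path

    def flush(self):
        """Copy cached files to the shared file system (cache is kept)."""
        for level in self.levels:
            for f in level.root.rglob("*"):
                if f.is_file():
                    dest = self.shared / f.relative_to(level.root)
                    dest.parent.mkdir(parents=True, exist_ok=True)
                    shutil.copy2(f, dest)

    def evict(self, name: str):
        """Free unneeded cached data to make room for future files."""
        for level in self.levels:
            path = level.root / name
            if path.exists():
                level.used -= path.stat().st_size
                path.unlink()
```

In the real system, cache levels would correspond to storage tiers such as tmpfs and node-local disk, ordered from fastest to slowest.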
For the OHBM Brainhack 2021, we sought to improve the Sea codebase with various new additions. First, we added GitHub Actions for continuous integration leading to a more robust codebase. Second, we extended our GitHub Actions workflows to test Sea on different Linux distributions. Finally, we improved the clarity of the README file to ease usage.
During the hackathon, we looked into integrating glibc test cases within Sea to ensure all file system functionalities are available and began efforts to test Sea with the HCP preprocessing pipeline and fMRIPrep. We will continue these efforts as future work.
1. Amunts, K., Lepage, C., Borgeat, L., Mohlberg, H., Dickscheid, T., Rousseau, M. É., ... & Evans, A. C. (2013). BigBrain: an ultrahigh-resolution 3D human brain model. Science, 340(6139), 1472–1475.
2. Van Essen, D. C., Smith, S. M., Barch, D. M., Behrens, T. E., Yacoub, E., Ugurbil, K., & Wu-Minn HCP Consortium. (2013). The Wu-Minn human connectome project: an overview. Neuroimage, 80, 62–79.
3. Alfaro-Almagro, F., Jenkinson, M., Bangerter, N. K., Andersson, J., Griffanti, L., Douaud, G., … & Smith, S. M. (2018). Image processing and Quality Control for the first 10,000 brain imaging datasets from UK Biobank. NeuroImage, 166, 400–424. https://doi.org/10.1016/j.neuroimage.2017.10.034
4. Zuo, X. N., Anderson, J. S., Bellec, P., Birn, R. M., Biswal, B. B., Blautzik, J., ... & Milham, M. P. (2014). An open science resource for establishing reliability and reproducibility in functional connectomics. Scientific Data, 1(1), 1–13.
CleanBibImpact: Do Papers With Citation Diversity Statements Have More Gender-balanced Reference Lists?
Dworkin and colleagues (1) showed that neuroscience papers over-cite men: there are more citations of papers with men as first and last authors than we would expect if gender were not involved in who we cite.
To raise awareness of this problem, this group created cleanBib, a tool that creates a citation diversity statement (CDS) for a publication’s reference section (2). These statements indicate how many references are
•man-man (man first author, man last author),
•man-woman,
•woman-man, and
•woman-woman.
Our goal is to see whether papers with CDSs have more gender-balanced reference lists.
In this hackathon, we
1. Manually collected papers with CDSs and extracted the gendered citation percentages from the statements;
2. Visualized the over- and undercitation of the different author gender groups in reference lists of papers with CDSs, compared with the broader neuroscience literature (as reported in (1); Figure 1A);
3. Created a tool to collect papers and extract the citation percentages; and
4. Visualized the similarity between the manually and automatically gathered data (Figure 1B).
Our preliminary results suggest that papers with CDSs have less gender imbalance than the broader literature. Figure 1A shows the over/undercitation rates of the different author gender categories; for papers with CDSs, these rates are less extreme. Papers with CDSs exhibit less extreme underciting (or slight overciting) of papers with women authors, compared with the broader literature. Further, they exhibit almost no over/underciting of papers with men as first and last authors.
The automatic data collection method works, but there is room for improvement. Figure 1B shows that data gathered manually and automatically roughly correspond to each other, with the automatic method tending to underestimate the citation rates. We found 110 papers manually, and our automatic method found 19 (partly due to the recency of publications with CDSs).
In the future, we hope to extend our database of citations, quantify these visual comparisons, and make more meaningful improvements to our methods. Further, we will continue considering how this project can be more inclusive in its documentation, teamwork, and analyses.
Ultimately, we cannot show that using a CDS causes authors to have less gender imbalance in their reference lists. However, we hope that this project will encourage researchers to reflect on their citation habits and improve them.
Our code can be found at: https://github.com/koudyk/cleanBibImpact.
Figure 1. (A) Percent over- and undercitation of different author gender groups, in reference lists of papers with citation diversity statements (CDSs) compared with the broader neuroscience literature (as reported in (1)). Over/undercitation rate is calculated as the (percentage observed - percentage expected)/percentage expected. The expected percentage is the percent of citations one would expect if gender were not involved in citation choices; these values were calculated in Dworkin et al. The arrows highlight the difference between the over/undercitation rates in the literature versus the rates in papers with CDSs. These data were found manually (n=110 papers). 95% confidence intervals are shown around the mean for each category. (B) Correspondence of the citation rates in papers found automatically and manually (n=19 papers found both ways). The gray diagonal line is the line of equality.
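For reference, the over/undercitation rate defined in the caption can be computed directly; the example values below are hypothetical, chosen only to illustrate the sign convention:

```python
def over_under_citation(observed_pct: float, expected_pct: float) -> float:
    """Over/undercitation rate: (observed - expected) / expected.

    Positive values mean a group is overcited relative to the gender-blind
    expectation; negative values mean it is undercited.
    """
    return (observed_pct - expected_pct) / expected_pct

# Hypothetical example: if 6% of citations go to woman-woman papers but the
# gender-blind expectation is 8%, that group is undercited by 25%.
rate = over_under_citation(6.0, 8.0)  # -0.25
```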
1. Dworkin, J. D., Linn, K. A., Teich, E. G., Zurn, P., Shinohara, R. T., & Bassett, D. S. (2020). The extent and drivers of gender imbalance in neuroscience reference lists. Nature Neuroscience, 23(8), 918–926.
2. Zhou, D., Stiso, J., Cornblath, E., Teich, E., Blevins, A. S., Oudyk, K., Michael, C., & virtualmario (2020). cleanBib v1.1.1 [Software]. Zenodo. https://doi.org/10.5281/zenodo.4104748
Physiological Signal Classification Challenge
BOLD-based functional magnetic resonance imaging (fMRI) is fundamentally a physiological measurement used to investigate neural events. Hence, one of its challenges is decoupling the signal of interest (e.g., brain activity during a task) from physiological activity, such as breathing or cardiac pulse, that confounds the BOLD effect (1). The best practice to overcome this challenge is to collect physiological data during fMRI acquisition, which can then be structured following the Brain Imaging Data Structure (BIDS) specification (2).
In this regard, physiopy is a Python suite aimed at preprocessing physiological data and promoting its use in fMRI settings. phys2bids (https://github.com/physiopy/phys2bids) is physiopy’s library for converting physiological recordings into BIDS format. One of the current development goals is the addition of automatic signal classification, which would avoid the time-consuming and error-prone task of manually labeling signals. As a starting point, the goal of this project was to explore and test different time-series classification strategies on the EuskalIBUR dataset (3), which includes a total of 240 recordings of four types: cardiac pulse, chest belt, O2, and CO2.
After exploring various alternatives for classifying the signals, we opted for the classification pipeline shown in Figure 1. First, signals were preprocessed (low-pass filtered based on local regression with a span of 0.6 s, and normalized). Then, the frequency capturing 75% of the signal power was used to classify signals as cardiac (f > 0.5 Hz) or respiratory (f < 0.5 Hz). Finally, a time-domain shape metric (sm) was computed as the mean difference between the down tidal signal (sdown) and a straight line (sline). A thresholding strategy was followed to label respiratory signals as chest (sm > -0.02), O2 (-0.1 < sm ≤ -0.02), or CO2 (sm ≤ -0.1).
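The thresholding stage described above can be sketched in Python as follows, assuming the dominant frequency f (capturing 75% of signal power) and the shape metric sm have already been computed; the preprocessing steps are omitted:

```python
# Sketch of the thresholding stage of the classifier. Inputs are assumed to
# be precomputed: f is the frequency capturing 75% of the signal power (Hz),
# sm is the time-domain shape metric described in the text.

def classify_signal(f: float, sm: float) -> str:
    if f > 0.5:          # faster than breathing -> cardiac pulse
        return "cardiac"
    # Respiratory signals are split by the shape metric sm.
    if sm > -0.02:
        return "chest"
    if sm > -0.1:        # i.e., -0.1 < sm <= -0.02
        return "O2"
    return "CO2"         # sm <= -0.1
```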
The classifier discerned between respiratory and cardiac signals perfectly and obtained 94.17% accuracy overall. These preliminary results show that different physiological signals have intrinsic features that can be exploited for automatic classification. Future efforts will take two paths. First, the algorithm should be thoroughly validated and potentially extended to other physiological signals (e.g., galvanic skin response). Second, we aim to make the software publicly available by implementing it in phys2bids.
1. Bulte, D., Wartolowska, K. Monitoring cardiac and respiratory physiology during FMRI. NeuroImage 154, 81–91 (2017) https://doi.org/10.1016/j.neuroimage.2016.12.001
2. Gorgolewski, K., Auer, T., Calhoun, V. et al. The brain imaging data structure, a format for organizing and describing outputs of neuroimaging experiments. Sci Data 3, 160044 (2016) https://doi.org/10.1038/sdata.2016.44
3. Moia, S., Termenon, M., Uruñuela, E., Chen, G., Stickland, R. C., Bright, M. G., Caballero-Gaudes, C. ICA-based denoising strategies in breath-hold induced cerebrovascular reactivity mapping with multi echo BOLD fMRI. Neuroimage 233, 117914 (2021) https://doi.org/10.1016/j.neuroimage.2021.117914
Figure 1. Developed physiological signal classification pipeline
Phys2BIDScoin: Integrating BIDScoin and Phys2bids for a User-friendly GUI-based BIDS Conversion of Physiological Files
Phys2bids is a Python 3 toolkit for converting physiological source data files to the Brain Imaging Data Structure (BIDS). BIDScoin (https://github.com/Donders-Institute/bidscoin) is also a Python 3 toolkit, meant to convert MRI source data files to BIDS. The aim of this project is to build an interface between the two complementary software packages so that each can benefit from the other’s strengths and present a unified conversion framework to end users.
BIDScoin comes with source data discovery functionality and a graphical interface to customize the discovered conversion heuristics. All interactions with the source data are done using plugins and the BIDScoin framework itself is therefore agnostic about the source data modality and type.
Phys2bids is a command-line tool that requires shell skills and technical knowledge about the data. It is a community-driven effort that supports a wide variety of physiological recordings, especially “non-MRI vendor” data files.
In this Brainhack project, we developed a new plugin named phys2bidscoin that serves as an interface between BIDScoin and phys2bids. Specifically, we:
•Updated part of the main phys2bids workflow to:
◦Return clearer exception types and messages when testing whether a source file is supported
◦Improve the return of retrieved attributes from source files
•Created a phys2bidscoin data discovery function
•Created a phys2bidscoin data conversion (wrapper) function around phys2bids
•Created a phys2bids bidsmap section with sensible (regexp) heuristics
•Updated the BIDScoin installation files to allow easier dependency tracking
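The discovery and attribute-retrieval hooks listed above can be sketched as follows; the function names, signatures, and supported extensions are illustrative assumptions for this sketch, not the actual phys2bidscoin API:

```python
from pathlib import Path

# Hypothetical sketch of two plugin hooks, assuming a BIDScoin-style
# interface in which a plugin (1) decides whether it supports a source file
# and (2) retrieves attributes from it. Names and extensions are examples.

SUPPORTED_EXTENSIONS = {".acq", ".txt", ".mat"}   # example physio formats

def is_sourcefile(sourcefile: Path) -> str:
    """Data discovery: return the data format if supported, else ''."""
    if sourcefile.suffix.lower() in SUPPORTED_EXTENSIONS:
        return "Physio"
    return ""

def get_attribute(sourcefile: Path, attribute: str) -> str:
    """Retrieve an attribute from a supported source file.

    Raises a clear exception type for unsupported files, mirroring the
    improved error reporting added to the phys2bids workflow.
    """
    if not is_sourcefile(sourcefile):
        raise ValueError(f"Unsupported physiological file: {sourcefile}")
    # A real implementation would parse the recording here.
    return {"extension": sourcefile.suffix}.get(attribute, "")
```

In this model, BIDScoin itself stays agnostic about physiological data: it simply calls the plugin's hooks during data discovery and conversion.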
The enhancements to the two packages are in the alpha (pre-release) stage. While they are available via GitHub, no CI testing has yet been set up to cover them. In the upcoming months, we will therefore set up testing in order to create an official release of the new code.
In the future, we will also work to expand supported data types in phys2bids, and to add native YAML support for heuristics. In BIDScoin, we will further develop its package management to allow modular installation of dependencies, for example, in relation to the plugins.
The user-friendliness and increased source data coverage of phys2bidscoin facilitate further adoption of the BIDS standard, thus promoting data sharing and reproducibility. They also reduce the amount of nonscientific work in the scientific process, allowing neuroscientists to devote more time and energy to their research questions of interest.
The Canadian Open Neuroscience Platform: CONP
The Canadian Open Neuroscience Platform (CONP) provides an infrastructure for promoting open-science workflows and sharing neuroscience data. The platform’s goal is to bring researchers together into an interactive network of collaborations in brain research, with interdisciplinary student training, international partnerships, clinical translation, and open publishing. Our goal during the hackathon was to educate participants about the various tools, datasets, and functionalities available on the portal, and to encourage them to share their own data or software. We highlighted the different methodologies and considerations involved in sharing their work in a FAIR manner, as this is often a barrier to scientific collaboration. We provided a demo, actively trained participants, and offered a Q&A on the portal as well as on overarching open-science concepts.
The future goals of this project are to include more tools and datasets, as well as to add enhanced search functionality, data sharing capabilities, additional provenance tools, and many more features. Hackathon training on a platform such as CONP helps researchers, students, and developers ensure that smaller datasets are properly integrated for community consumption using a DataLad (1) backend, instead of disappearing entirely. Researchers will benefit from increased sample sizes of properly described datasets, as well as from more processing options using well-described tools and pipelines. This will make research easier and ultimately accelerate scientific discovery.
1. Halchenko, Y., Meyer, K., Poldrack, B., Solanky, D., Wagner, A., Gors, J., ... & Hanke, M. (2021). DataLad: distributed system for joint management of code, data, and their relationship. Journal of Open Source Software, 6(63), 3262.
The OHBM 2021 Brainhack was a challenge but also an unequivocal success. Having hundreds of scientists come together from such a range of nations and continents is a testament to the spirit of collaboration that is alive and well in the human brain mapping community. We want to thank our sponsors for their generous support: OHBM (11), QMENTA (12), OpenNeuro (13), NeuroMod (14), and CONP (15), as well as all of our participants and all previous members of the Open Science Special Interest Group, without whom we would never have been able to pull off this complex event.
1. Gau, R. et al. Brainhack: Developing a culture of open, inclusive, community-driven neuroscience. Neuron 109, 1769–1775 (2021).
2. Levitis, E. et al. Centering inclusivity in the design of online conferences—An OHBM–Open Science perspective. Gigascience 10, giab051 (2021).
3. OHBM Open-science Special Interest Group. HackTrack. OHBM Brainhack 2021 https://ohbm.github.io/hackathon2021/hacktrack/.
4. Sparkle. https://sparklespace.com/ (2021).
5. Crowdcast. https://www.crowdcast.io/.
6. Zoom. https://zoom.us/.
7. YouTube. https://www.youtube.com/.
8. DouYu - a live streaming platform for everyone. https://www.douyu.com/.
9. ’t Hart, B. M., Achakulvisut, T. & Adeyemi, A. et al. Neuromatch Academy: a 3-week, online summer school in computational neuroscience. J Open Source Educ (2022).
10. Wilkinson, M. D. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3, 160018 (2016).
11. Organization for Human Brain Mapping. https://www.humanbrainmapping.org/i4a/pages/index.cfm?pageid=1.
12. QMENTA - Neuroimaging software with AI-powered analysis. https://www.qmenta.com/.
13. OpenNeuro. https://openneuro.org/.
14. Courtois NeuroMod. https://www.cneuromod.ca/.
15. The Canadian Open Neuroscience Platform. https://conp.ca/.
This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits authors to copy and redistribute the material in any medium or format, remix, transform and build upon material, for any purpose, even commercially.