Workspaces vs Repositories

[Figure: research data cycle]

There is a fundamental distinction between repositories and workspaces, and between the roles they play in a FAIR-compliant research workflow.

Examples of Workspaces

Generic workspace examples include:

  • Survey tools (SurveyMonkey, REDCap)
  • Electronic Lab Notebooks
  • Code environments (Jupyter notebooks, BinderHub, GitHub*, MATLAB, RStudio)
  • Research & Analytical Databases**
  • “R: Drive” or similar storage
  • Cloud storage such as Dropbox, Google Drive or OneDrive

*Yes, Git uses “repositories”, but these do not function as archives. Don’t assume that Microsoft, the current owner of GitHub, will preserve research or other code.

**No, a database is not necessarily a repository (more on that below).

Features of a repository

⬇️ Governance / Policy ⬇️

  • Purpose / mission
  • Data retention policies
  • Planning:
    • ongoing stewardship & contingencies
    • persistence of IDs
    • data exit
    • software obsolescence
  • Licensing/permissions framework for deposit, use and redistribution

⬆️ Repo as Institution ⬆️

⬇️ Technology ⬇️

  • Tested data-exit pathway*
  • Interoperable metadata framework
  • APIs
  • Implementation of ID resolution & updating
  • Discovery services / catalogues / indices
    • Collection / archival structures
    • Full-text & semantic indexing
    • Facets for discovery based on explicit and implicit metadata
  • Sustainable, technology-neutral access-control mechanisms

*Get this right and the other things follow.

⬆️ Repo Implementation ⬆️

A repository is as much an institution as it is a software implementation.
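
As a rough illustration of the point that a repository needs both sides, the two feature lists can be treated as a checklist that a candidate service either satisfies or does not. The sketch below is only a thinking aid: the feature identifiers, the Service class and the example service are hypothetical names introduced here, and the pass/fail rule (every feature present) is an assumption, not a formal definition of a repository.

```python
from dataclasses import dataclass, field

# Feature names paraphrase the governance and technology lists;
# the identifiers themselves are hypothetical.
GOVERNANCE = (
    "purpose_or_mission",
    "data_retention_policy",
    "stewardship_and_contingency_plan",
    "licensing_framework",
)
TECHNOLOGY = (
    "tested_data_exit_pathway",
    "interoperable_metadata",
    "apis",
    "persistent_id_resolution",
    "discovery_services",
    "technology_neutral_access_control",
)


@dataclass
class Service:
    """A service described by the set of checklist features it provides."""
    name: str
    features: set = field(default_factory=set)


def missing_features(service):
    """Return the governance and technology features the service lacks."""
    return {
        "governance": [f for f in GOVERNANCE if f not in service.features],
        "technology": [f for f in TECHNOLOGY if f not in service.features],
    }


def looks_like_repository(service):
    """True only when nothing on either side of the checklist is missing."""
    gaps = missing_features(service)
    return not gaps["governance"] and not gaps["technology"]


# Hypothetical example: a storage service that exposes an API but carries
# none of the institutional commitments.
cloud_folder = Service("cloud folder", {"apis"})
print(looks_like_repository(cloud_folder))
print(missing_features(cloud_folder)["governance"])
```

Under these assumptions, a workspace that has good technology but no retention policy or stewardship plan still fails the check, which is the sense in which the institutional side matters as much as the software.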

CARE / FAIR

⬇️ Governance / Policy ⬇️
  • CARE Principles
⬆️ Repo as Institution ⬆️

⬇️ Technology ⬇️
  • FAIR Principles
⬆️ Repo Implementation ⬆️

Differences between Repositories and Archives (and why we say Archival Repository)

Repository and Archive are closely related terms used by different communities; both carry a core meaning of ‘keeping’ something for an appropriate time span. Some non-archivists might quip that archives are where things go never to be seen again, while to archivists, repositories or digital libraries may lack the rigour of proper archival practice and use unfamiliar organizing principles.

In the Higher Education sector, repositories are probably best known for their role in the Open Access publications movement. While these are typically called “Institutional Repositories”, it is helpful to think of them as “Institutional Publications Repositories”. Many of these do contain research data, but they typically lack access controls for non-open data, impose size limits on deposits and have no built-in domain extensibility.

In the RRKive project we aim to set aside these interesting discussions in the interest of finding common ground, so we use the term Archival Repository.