Snapshots, distribution, and states
Reproducibility is one of the central design goals of Nuvolos: anyone who receives your work should be able to reproduce it without setup, and you should be able to return to any past version of your own work without ceremony. Three concepts make this possible - states, snapshots, and distribution - and they are best understood together. This chapter explains each one, then shows how they combine to support versioning, sharing, and recovery.
States: the mutable and the immutable
Every Instance has exactly one current state - the live, mutable version where you are actively working. Files you save, tables you create, and packages you install all change the current state.
Every Instance can also have any number of snapshot states - frozen, immutable copies of what the current state looked like at a particular moment. Snapshots cannot be modified; they can only be created, viewed, restored from, distributed from, or deleted.
This split is the foundation of reproducibility. The current state lets you work fluidly. Snapshot states preserve the past in a form that cannot drift.
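The split between a mutable current state and immutable snapshot states can be sketched with a toy model. This is a conceptual illustration only, not the Nuvolos API: the `Instance` class, its `snapshot` and `restore` methods, and the dictionary-of-files representation are all hypothetical.

```python
from types import MappingProxyType


class Instance:
    """Toy model: one mutable current state plus named read-only snapshots."""

    def __init__(self):
        self.current = {}    # mutable: object name -> content
        self.snapshots = {}  # snapshot name -> read-only copy of a past state

    def snapshot(self, name):
        # Freeze a copy of the current state. Later edits to `current`
        # do not affect the copy, and the copy itself rejects writes.
        self.snapshots[name] = MappingProxyType(dict(self.current))

    def restore(self, name):
        # Replace the current state with a fresh mutable copy of the snapshot.
        self.current = dict(self.snapshots[name])


inst = Instance()
inst.current["analysis.py"] = "v1"
inst.snapshot("milestone-1")

inst.current["analysis.py"] = "v2"  # the current state moves on
# inst.snapshots["milestone-1"]["analysis.py"] = "x" would raise TypeError

inst.restore("milestone-1")         # the current state is back at "v1"
```

In this sketch the snapshot stays untouched by the restore, so it can be restored from again later.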
Snapshots: complete, immutable, persistent, restorable, shareable
A snapshot captures the current state of an Instance - files, database tables, Application configurations, and dependencies - in one operation. Every snapshot has five properties:
Complete - everything in the Instance is saved as a single unit. There is no risk of partially captured state.
Immutable - the copy is read-only from the moment it is created. No data, code, or setting can be changed after the fact.
Persistent - the snapshot remains available until explicitly deleted by a user.
Restorable - you can return an Instance to a previous snapshot at any time, making it safe to experiment without fear of losing work.
Shareable - snapshots can be distributed to other Instances, colleagues, or students, carrying the full working environment with them.
Taken together, these properties make snapshots the foundation of reproducibility on Nuvolos: if you can snapshot it, you can version it, restore it, and share it - and anyone who receives it gets exactly what you had.
Distribution: the push mechanism
Distribution is the mechanism in Nuvolos for sharing the content of your work with others. You can distribute:
Files, sets of files, or entire directories
Database tables or sets of tables
Applications with all their configurations and dependencies
Entire snapshots containing the full state of your work
Recipients can be members of your own project team, students in your course, or external colleagues and referees.
Distribution is a push operation: you select the items you want to share and push them to a target location. This happens through the Stage - a temporary area for collecting objects to be distributed. During the process, you choose how to handle conflicts (overwrite, skip, or rename), in the same spirit as a file manager.
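The three conflict-handling choices behave much like their file-manager counterparts. The sketch below models them on plain dictionaries. It is an illustration under stated assumptions, not Nuvolos code: the `distribute` function, the strategy strings, and the `name (1).ext` renaming scheme are all hypothetical.

```python
import os


def distribute(staged, target, strategy="skip"):
    """Push staged objects into a target, resolving name conflicts.

    staged, target: dicts mapping object name -> content.
    strategy: "overwrite", "skip", or "rename" (hypothetical identifiers).
    Returns a new dict; neither input is modified.
    """
    if strategy not in ("overwrite", "skip", "rename"):
        raise ValueError(f"unknown strategy: {strategy!r}")

    result = dict(target)
    for name, content in staged.items():
        if name not in result or strategy == "overwrite":
            # No conflict, or the incoming object replaces the existing one.
            result[name] = content
        elif strategy == "rename":
            # Keep both: store the incoming object under the first free name.
            stem, ext = os.path.splitext(name)
            n = 1
            while f"{stem} ({n}){ext}" in result:
                n += 1
            result[f"{stem} ({n}){ext}"] = content
        # strategy == "skip": the existing object is left untouched.
    return result


staged = {"data.csv": "new", "notes.txt": "hello"}
target = {"data.csv": "old"}

overwritten = distribute(staged, target, "overwrite")
skipped = distribute(staged, target, "skip")
renamed = distribute(staged, target, "rename")
```

Objects that do not collide (`notes.txt` here) arrive unchanged under every strategy; only the handling of `data.csv` differs.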
Because Applications and data in Nuvolos are containerised (as explained in the sections on applications and snapshots), you can share them individually or wholesale. Instead of emailing files around, you select what to share in the Nuvolos UI and Nuvolos handles the rest - preserving the relationship between code, data, and Application configuration that makes work actually reproducible.
For step-by-step instructions on distributing objects, see Object Distribution.
Distribution use cases
The three concepts compose into a small number of patterns that show up everywhere on Nuvolos.
Versioning
Take a snapshot whenever you reach a milestone - a working version of an analysis, a tested course module, a clean dataset. The snapshot is a permanent reference point. The current state continues evolving from there, and you can always restore from the snapshot if a later change does not pan out.
Safe experimentation
Before a risky change - a major package upgrade, a refactor, a destructive query - take a snapshot of the current state. If the change works, keep going. If it doesn't, restore from the snapshot and you are exactly where you started. This is conceptually similar to a version control checkpoint, but it captures the entire environment, not just the code.
Partial restore
Distribution can also be used as a partial restore mechanism. If you only want to recover a single file or table from a previous snapshot - without rolling back the entire Instance - you can stage the specific objects from the snapshot and distribute them back to the current state. Everything else stays as it was.
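The difference between a partial restore and a full rollback can be illustrated on the same kind of toy model. This is a minimal sketch, assuming states are represented as dictionaries; `partial_restore` is a hypothetical helper, not a Nuvolos function.

```python
def partial_restore(snapshot, current, names):
    """Copy only the named objects from a snapshot into the current state.

    Returns a new dict; everything not listed in `names` stays as it is
    in `current`.
    """
    missing = [n for n in names if n not in snapshot]
    if missing:
        raise KeyError(f"not in snapshot: {missing}")

    restored = dict(current)
    for n in names:
        restored[n] = snapshot[n]
    return restored


snapshot = {"model.py": "v1", "results.csv": "clean"}
current = {"model.py": "v2", "results.csv": "corrupted", "draft.md": "wip"}

# Recover one file; newer work on model.py and draft.md is kept.
after = partial_restore(snapshot, current, ["results.csv"])
```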
Course delivery
Build and test course material in the Master Instance, snapshot it, and distribute the snapshot to all students. Every student receives the identical environment, ready to use. Materials can be updated mid-course by distributing again - the previous snapshot remains intact as a record of what students received originally.
Assignment audit trails
Assignments are a special type of distribution that also:
Adds a deadline for a student response (hand-in).
Creates dedicated storage for responses and instructor feedback.
Creates an audit trail of all actions taken in response to the distribution.
Onboarding new collaborators
Take a snapshot of the project as it currently stands and distribute it to a new team member's Instance. They start working immediately with the full working environment - same files, same data, same Application configuration - without spending a day setting up infrastructure.
Research reproducibility
Snapshot the exact combination of code, data, and Application configuration that produced a result. Share it with referees, collaborators, or as supplementary material for a publication. Recipients can reproduce the work without configuration steps - the snapshot already contains everything.
The Distributed Instance
Course Spaces include a special Instance called Distributed that exists alongside the Master and any student Instances. The Distributed Instance accumulates everything that has been mass-distributed to the Space - it is the canonical record of what students were given.
Two properties make it work:
Pushing content to the Distributed Instance happens automatically when you run a "distribute to all" operation. This is also how you push corrections - overwriting an existing file in the Distributed Instance updates what new students receive.
All editors of all Instances in the Space have viewer access to the Distributed Instance, so any editor can pull content from it back to their own Instance - useful for resetting a broken Application to the instructor-provided state.
You can also recall files directly from the Master Instance.
Why deleted data is not always reclaimed
Datasets and tables are stored persistently and tracked across all snapshots. This design ensures consistency and reproducibility, but it has a consequence that surprises some users: data is not automatically deleted from storage when removed only from the current state.
Snapshots are immutable: Once a snapshot is created, it preserves the state of the Instance at that point in time. You cannot selectively modify or remove parts of a snapshot.
Shared data references: Tables or files deleted in the current state may still be retained in older snapshots, which continue to reference the same underlying data.
Deleting a table or file from the current state alone does not free up storage space. To fully reclaim storage, every snapshot that contains the data must also be deleted. This is intentional: historical states remain intact unless explicitly removed, which is what makes them trustworthy as references.
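One way to picture this behaviour is as reference tracking: stored data can only be reclaimed once no state refers to it any more. The sketch below is a conceptual model, not a description of how Nuvolos implements storage; the blob identifiers and the `referenced_blobs` helper are hypothetical.

```python
def referenced_blobs(current, snapshots):
    """Return the set of blob ids still referenced by any state.

    current: dict mapping object name -> blob id.
    snapshots: dict mapping snapshot name -> (object name -> blob id).
    """
    referenced = set(current.values())
    for snap in snapshots.values():
        referenced.update(snap.values())
    return referenced


current = {"small.csv": "blob-1", "big.parquet": "blob-2"}
snapshots = {"milestone-1": dict(current)}

# Deleting the table from the current state alone frees nothing:
del current["big.parquet"]
still_held = referenced_blobs(current, snapshots)

# Only after every snapshot containing it is deleted can it be reclaimed:
del snapshots["milestone-1"]
after_cleanup = referenced_blobs(current, snapshots)
```

In this model `blob-2` remains held after the first deletion because `milestone-1` still refers to it, and disappears from the referenced set only once that snapshot is gone.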
Where to go next
For step-by-step procedures, see How-to › Shared workflows › Snapshots and How-to › Shared workflows › Distribution.
For distribution strategy options (Overwrite, Skip, and others), see Reference › Configuration › Distribution strategies.
For the role-specific application of these patterns, see the relevant tutorials in Tutorials › For Researchers and Tutorials › For Instructors.