From notebook to reproducible result

What you will achieve. By the end of this tutorial you will have run a complete analysis cycle (query → analyse → store), captured the working environment as a named snapshot, prepared a copy of it for an external reviewer, and exported the project as a portable Docker image you can hand to anyone, including someone outside your Nuvolos organisation.

How long it takes. About 60 minutes for the full cycle, depending on how much analysis you actually run. The export step itself can take a while in the background.

What you need before you start. You should have completed the 'Your first research project' tutorial - that is, you have a research project on Nuvolos, you can navigate to it, and you have at least one application installed. You also need at least one dataset accessible from your project (either a Nuvolos table distributed to your instance, or a file you have already brought in).

Step 1 - Run a complete analysis cycle

The full reference, with both Matlab and RStudio code listed in parallel, lives in How-to › Database research workflow.

A scientific workflow on Nuvolos breaks down into three steps that you'll do back-to-back, in the same application session:

  • Query research-relevant data - pull what you need from a Nuvolos table or file.

  • Analyse - transform, fit, summarise.

  • Store - write the result back somewhere it will outlast your application session.

Open RStudio (or Matlab, or whichever application you prefer). The cycle in RStudio looks like this:

Querying relevant data

The example uses the Fama-French factor set available to demo users, joining a monthly stock series with the 5-factor table for one stock:

SELECT NAF.*, SM.MPRC, SM.MRET*100 AS SM_MRET_100, SM.MTCAP 
FROM NORTH_AMERICA_5_FACTORS NAF 
INNER JOIN TIME_SERIES_MONTHLY SM 
ON SM.MCALDT = NAF.DATE 
WHERE KYPERMNO = 14593

The code below executes the query. It assumes the SQL string above has been saved in query_string.

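# Open a database connection via the nuvolos package
conn <- nuvolos::get_connection()
# Run the SQL held in query_string and return the result as a data frame
dataset_factor <- DBI::dbGetQuery(conn, query_string)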
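A minimal sketch of one way to get the SQL into query_string, assuming you want the stock identifier as a parameter rather than hard-coded. build_factor_query is a hypothetical helper written for this illustration and is not part of the nuvolos package. The table and column names are the ones from the query above.

```r
# Hypothetical helper: builds the example query for one stock identifier.
# Only a single whole number is accepted, so nothing else can be pasted
# into the SQL text.
build_factor_query <- function(permno) {
  stopifnot(is.numeric(permno), length(permno) == 1,
            permno == as.integer(permno))
  paste(
    "SELECT NAF.*, SM.MPRC, SM.MRET*100 AS SM_MRET_100, SM.MTCAP",
    "FROM NORTH_AMERICA_5_FACTORS NAF",
    "INNER JOIN TIME_SERIES_MONTHLY SM",
    "ON SM.MCALDT = NAF.DATE",
    sprintf("WHERE KYPERMNO = %d", as.integer(permno)),
    sep = "\n"
  )
}

# Same identifier as in the example query
query_string <- build_factor_query(14593)
cat(query_string, "\n")
```

Keeping the query in a function makes it easy to record in the snapshot description exactly which identifier a result was produced for.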

Simple analysis

Fit a linear regression on the data frame and write the fitted values back into it:

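# Excess return: stock return (scaled by 100 in the query) minus the risk-free rate
dataset_factor$EXCESS_RETURN <- dataset_factor$SM_MRET_100 - dataset_factor$RF
# Five-factor regression; the (-1) term drops the intercept
mod <- lm(EXCESS_RETURN ~ (-1) + MKT_RF + SMB + HML + RMW + CMA, dataset_factor)
dataset_factor$FIT_FACTOR_5 <- mod$fitted.values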
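One caveat with the assignment above: if the data frame contains rows with missing values, lm() drops them and mod$fitted.values is shorter than the data frame, so the assignment fails. The sketch below shows one way around this, using na.action = na.exclude together with fitted(). It runs on simulated data with the same column names; the numbers are invented for illustration and are not the Fama-French data.

```r
set.seed(1)
n <- 120  # ten years of simulated monthly observations

# Simulated stand-in for dataset_factor (illustrative values only)
demo <- data.frame(
  MKT_RF = rnorm(n), SMB = rnorm(n), HML = rnorm(n),
  RMW = rnorm(n), CMA = rnorm(n), RF = 0.2
)
demo$SM_MRET_100 <- demo$RF + 1.2 * demo$MKT_RF + 0.3 * demo$SMB +
  rnorm(n, sd = 0.5)
demo$SM_MRET_100[5] <- NA  # one month with a missing return

demo$EXCESS_RETURN <- demo$SM_MRET_100 - demo$RF

# na.exclude makes fitted() return one value per input row,
# with NA in the rows that were dropped from the fit
mod <- lm(EXCESS_RETURN ~ (-1) + MKT_RF + SMB + HML + RMW + CMA,
          data = demo, na.action = na.exclude)
demo$FIT_FACTOR_5 <- fitted(mod)

print(coef(mod))  # five slopes, no intercept
```

If your query result has no missing values, the two approaches give the same fitted column.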

Storing results

Write the result back to the database as a new table:

DBI::dbWriteTable(conn, name="APPLE_5FACTOR_FIT", value=dataset_factor, batch_rows = 10000)

Same cycle, different language

In Matlab the cycle looks identical: get_connection() / select(con, query_string) to query, fitlm() to fit, sqlwrite() to store. The pattern generalises to any Nuvolos-supported language: query, analyse, store.

Step 2 - Capture the working environment as a named snapshot

A snapshot at this point is what makes the result reproducible - anyone restoring this snapshot will see the same files, the same installed packages, the same environment in which your result was produced. Without it, you have a result; with it, you have a result that someone else can reach by the same path.

From the left sidebar, hover over the camera icon, click TAKE SNAPSHOT AND DESCRIBE, and give the snapshot a meaningful name. Convention from the Nuvolos team: snapshots that represent a stable, citable state of the data or analysis are often called "vintages", for example "v1.0 - final analysis 2026-05". Add a description that captures what is specifically reproducible about this state (the query you ran, the regression specification, and the package versions if you pinned them).
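If you want the package versions on record before taking the snapshot, one option is to write them to a file in the project from the same R session. This is a sketch using base R only; the file name session_info.txt and the helper pkg_version are assumptions made for this example, not a Nuvolos convention.

```r
# Assumed file name; any path inside the project would do
info_file <- "session_info.txt"

# Full record of R version, platform, and attached packages
writeLines(capture.output(sessionInfo()), info_file)

# Hypothetical helper: version of one package, or NA if it is not installed
pkg_version <- function(p) {
  if (requireNamespace(p, quietly = TRUE)) {
    as.character(utils::packageVersion(p))
  } else {
    NA_character_
  }
}

# Versions of the packages used in Step 1, ready to paste into the description
versions <- vapply(c("stats", "DBI", "nuvolos"), pkg_version, character(1))
print(versions)
```

Because the file is written inside the project, it is captured together with the other files when you take the snapshot.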

Step 3 - Prepare a copy for a reviewer

If you are publishing or peer-reviewing, a journal editor or external reviewer often needs to inspect or re-run your work. The clean way to do this on Nuvolos is to give them an exact copy of your environment - restricted to that copy, with no access to anything else in your organisation.

From the project containing your snapshot:

  1. Create a new instance in the same space (Cogwheel → Project Users / Instances → + ADD NEW INSTANCE). Name it after the reviewer or the journal (e.g. "Reviewer copy - XYZ").

  2. Distribute the snapshot from Step 2 to the new instance - go to your Master instance, stage the relevant files/tables/applications, open the distribute menu, and select the new instance as the target.

  3. Invite the reviewer to the new instance as an Instance Editor. They can see and run everything in that one instance and nothing else in your organisation.

Step 4 - Export the project for an external audience

If you need to hand the project to someone you can't invite to Nuvolos at all - a customer, a partner, an auditor, an external collaborator - exporting is the right tool. Application Export turns a Nuvolos application into a portable Docker image that can be deployed outside the platform.

You can export at two levels:

  • The application only - share configuration and runtime setup, but not data, code, or files. Most common when working with third parties.

  • The application together with its files - a complete (with caveats) snapshot of the project state, including file-based artifacts used by the application.

For the exact procedure, see Application Export in Reference, which documents it end-to-end. This tutorial focuses on the reproducibility flow rather than the export mechanics.
