> For the complete documentation index, see [llms.txt](https://docs.nuvolos.com/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://docs.nuvolos.com/how-to-guides/workflows-for-researchers/a-complete-database-research-workflow-matlab-and-rstudio.md).

# Database research workflow - Matlab & RStudio

<mark style="color:$primary;">**Outcome**</mark>\
You query data from the Scientific Data Warehouse, analyse it in MATLAB or RStudio, and write results back as a new table.

<mark style="color:$primary;">**Before you start**</mark>

* Tables are activated in your space. See [Reference › Data storage](/reference/data-storage.md) if not.
* You have at least one MATLAB or RStudio application configured.
* You have identified the source tables for your analysis.

Nuvolos supports a *query–analyse–insert* cycle. The example below uses the Fama-French 5-factor table (`NORTH_AMERICA_5_FACTORS`) joined with a monthly stock returns table - the same pattern extends to any tables you have access to.

A standard skeleton scientific workflow in MATLAB/RStudio can be broken down into three main steps:

1. Query research-relevant data
2. Analyse, transform or otherwise manipulate data
3. Store results

You can create more complex workflows, but they will usually be built from three-step modules like the one above.

## MATLAB workflow

{% stepper %}
{% step %}

### Querying relevant data

After opening the MATLAB application, define your SQL query as a string and execute it:

{% code overflow="wrap" %}

```sql
query_string = "SELECT NAF.*, SM.MPRC, SM.MRET*100 AS SM_MRET_100, SM.MTCAP " + ...
               "FROM NORTH_AMERICA_5_FACTORS NAF " + ...
               "INNER JOIN TIME_SERIES_MONTHLY SM " + ...
               "ON SM.MCALDT = NAF.DATE " + ...
               "WHERE KYPERMNO = 14593";
 
con = get_connection();
dataset_factor = select(con, query_string);
```

{% endcode %}
{% endstep %}

{% step %}

### The simple analysis

Use MATLAB's fitlm to fit a linear regression with an R-style formula:

{% code overflow="wrap" %}

```sql
dataset_factor.EXCESS_RETURN = dataset_factor.SM_MRET_100 - dataset_factor.RF;
 
mod = fitlm(dataset_factor, ...
  'EXCESS_RETURN ~ (-1) + MKT_RF + SMB + HML + RMW + CMA');
 
dataset_factor.FIT_FACTOR_5 = mod.Fitted;
```

{% endcode %}
{% endstep %}

{% step %}

### Write results back

Use the sqlwrite command to push the results back to the warehouse as a new table:

{% code overflow="wrap" %}

```sql
sqlwrite(con, 'APPLE_5FACTOR_FIT', dataset_factor);
```

{% endcode %}
{% endstep %}
{% endstepper %}

## RStudio workflow

{% stepper %}
{% step %}

### Querying relevant data

After opening the RStudio application, define your SQL query as a string and execute it:

{% code overflow="wrap" %}

```sql
query_string <- "
SELECT NAF.*, SM.MPRC, SM.MRET*100 AS SM_MRET_100, SM.MTCAP
FROM NORTH_AMERICA_5_FACTORS NAF
INNER JOIN TIME_SERIES_MONTHLY SM
  ON SM.MCALDT = NAF.DATE
WHERE KYPERMNO = 14593
"
 
conn <- nuvolos::get_connection()
dataset_factor <- dbGetQuery(conn, query_string)
```

{% endcode %}
{% endstep %}

{% step %}

### The simple analysis

Use RStudio's fitlm to fit a linear regression with an R-style formula:

{% code overflow="wrap" %}

```sql
dataset_factor$EXCESS_RETURN <- dataset_factor$SM_MRET_100 - dataset_factor$RF
 
mod <- lm(EXCESS_RETURN ~ (-1) + MKT_RF + SMB + HML + RMW + CMA,
          dataset_factor)
 
dataset_factor$FIT_FACTOR_5 <- mod$fitted.values
```

{% endcode %}
{% endstep %}

{% step %}

### Write results back

As a final step, we write back the results using the data upload command for R:

{% code overflow="wrap" %}

```sql
DBI::dbWriteTable(conn, name = "APPLE_5FACTOR_FIT",
                  value = dataset_factor, batch_rows = 10000)
```

{% endcode %}
{% endstep %}
{% endstepper %}

For the full data warehouse reference - Python and Stata connectors, off-Nuvolos access, ODBC drivers, big-data exports - see [Reference › Data storage](/reference/data-storage.md).


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter, and the optional `goal` query parameter:

```
GET https://docs.nuvolos.com/how-to-guides/workflows-for-researchers/a-complete-database-research-workflow-matlab-and-rstudio.md?ask=<question>&goal=<endgoal>
```

`ask` is the immediate question: it should be specific, self-contained, and written in natural language.
`goal` is optional and describes the broader end goal you are ultimately trying to accomplish on behalf of the user. GitBook uses it to tailor the answer towards what is most useful for that goal.

The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
