A complete database research workflow (Matlab & RStudio)

Nuvolos supports a query-analyze-insert cycle to help researchers query, analyse and store data safely in a simple, integrated workflow. The workflow is demonstrated in two languages, but can be extended to any languages supported by Nuvolos.

Matlab workflow

A standard skeleton scientific workflow in Matlab can be broken down into three main steps:

Query research-relevant data
Analyse, transform or otherwise manipulate data
Store results

Querying relevant data

For the mock example, we are going to work with the Fama-French factor set that is available for our Demo user, we will be focusing on the North America 5-factor table. The NORTH_AMERICA_5_FACTORS table has been distributed to the instance we are working in.

After opening the Matlab application, the following bit of code will return the entire database table as a Matlab table-type object. The query we are executing is a merge of a monthly stock series table (for Apple monthly stock prices) and the Fama-French factor table as follows.

The SQL query:

SELECT NAF.*, SM.MPRC, SM.MRET*100, SM.MTCAP 
FROM NORTH_AMERICA_5_FACTORS NAF 
INNER JOIN TIME_SERIES_MONTHLY SM 
ON SM.MCALDT = NAF.DATE 
WHERE KYPERMNO = 14593

The code that executes the query, the above string is saved in query_string.

con = get_connection();
dataset_factor = select(con, query_string);

The simple analysis

The previous step resulted in dataset_factor containing a Matlab Table object that holds the data. The fitlm method fits a linear regression on the table with an R-style formula. We then write back the fitted values to the Table as a column.

dataset_factor.EXCESS_RETURN = dataset_factor.SM_MRET_100 - dataset_factor.RF;
mod = fitlm(dataset_factor,'EXCESS_RETURN ~ (-1) + MKT_RF + SMB + HML + RMW + CMA')
dataset_factor.FIT_FACTOR_5 = mod.Fitted;

Storing results in the database

As a final step, we write back the results using the data upload command for Matlab:

sqlwrite(con,'APPLE_5FACTOR_FIT',dataset_factor);

As an end result, we can then find the table on the Nuvolos UI:

RStudio workflow

A standard skeleton scientific workflow in RStudio can be broken down into three main steps:

Query research-relevant data
Analyse, transform or otherwise manipulate data
Store results

It is possible to create more complext workflows, but it will usually consist of three-step modules such as above.

Querying relevant data

After opening the RStudio application, the following bit of code will return the entire database table as an R data frame object. The query we are executing is a merge of a monthly stock series table (for Apple monthly stock prices) and the Fama-French factor table as follows.

The SQL query:

SELECT NAF.*, SM.MPRC, SM.MRET*100 AS SM_MRET_100, SM.MTCAP 
FROM NORTH_AMERICA_5_FACTORS NAF 
INNER JOIN TIME_SERIES_MONTHLY SM 
ON SM.MCALDT = NAF.DATE 
WHERE KYPERMNO = 14593

The code that executes the query, the above string is saved in query_string.

conn <- nuvolos::get_connection()
dataset_factor <- dbGetQuery(conn, query_string)

The simple analysis

The previous step resulted in dataset_factor containing an R data.frame object that holds the data. The lm method fits a linear regression on the data frame. We put the fitted values to the data frame.

dataset_factor$EXCESS_RETURN <- dataset_factor$SM_MRET_100 - dataset_factor$RF
mod <- lm(EXCESS_RETURN ~ (-1) + MKT_RF + SMB + HML + RMW + CMA, dataset_factor)
dataset_factor$FIT_FACTOR_5 <- mod$fitted.values

Storing results in the database

As a final step, we write back the results using the data upload command for R:

DBI::dbWriteTable(conn, name="APPLE_5FACTOR_FIT", value=dataset_factor, batch_rows = 10000)

PreviousImporting data on Nuvolos NextAccessing data as data.frames in R

Last updated 2 years ago

Was this helpful?

hashtagMatlab workflow

hashtagQuerying relevant data

hashtagThe simple analysis

hashtagStoring results in the database

hashtagRStudio workflow

hashtagQuerying relevant data

hashtagThe simple analysis

hashtagStoring results in the database

Matlab workflow

Querying relevant data

The simple analysis

Storing results in the database

RStudio workflow

Querying relevant data

The simple analysis

Storing results in the database