R with paws

The dcdcr (deprecated) page explains how to use the DataCamp Data Connector through dcdcr, an R package created by DataCamp. If you cannot install that package, you can use the open-source paws package instead.

The dcdcr package is just a wrapper around the paws package, custom-built to support our Data Model.

This page describes how to use the paws package to analyze data from the Data Connector in R.
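The examples on this page create the S3 client with paws::s3() and rely on environment variables for authentication. As a minimal sketch, assuming your Data Connector credentials are an AWS access key pair and that you use the standard AWS variable names, you could set them for the current R session as follows. The placeholder values are hypothetical; use the credentials and region that apply to your bucket.

```r
# Hypothetical placeholder values; replace them with your own credentials.
# The variable names are the standard AWS ones (an assumption about your setup).
Sys.setenv(
  AWS_ACCESS_KEY_ID = "<your access key id>",
  AWS_SECRET_ACCESS_KEY = "<your secret access key>",
  AWS_REGION = "<your bucket region>"
)

# Check that all three variables are visible to the current R session
required <- c("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION")
credentials_set <- all(nzchar(Sys.getenv(required)))
credentials_set
```

To keep secrets out of your scripts, you may prefer to put the same variables in your .Renviron file, which R reads at startup.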

Examples

Get a list of all users

This script retrieves a list of all users from the Data Connector and stores it in a tibble.

library(dplyr)
library(paws)

S3_BUCKET_NAME <- "<your bucket name here>"

# Create the S3 client; authentication is done through environment variables
s3 <- paws::s3()

# Utility function to download a table's CSV file and load it into a tibble
get_data_frame_from_s3 <- function(table) {
  key <- paste0("latest/", table, ".csv")
  response <- s3$get_object(Bucket = S3_BUCKET_NAME, Key = key)

  # The response body is a raw vector: convert it to text, then parse the CSV
  df <- response$Body %>%
    rawToChar() %>%
    read.csv(text = .) %>%
    as_tibble()

  return(df)
}

user_dim <- get_data_frame_from_s3("user_dim")

user_dim
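The utility function above turns the raw bytes returned by S3 into a data frame in two steps: rawToChar converts the bytes to a string, and read.csv parses that string. The sketch below reproduces those steps offline, using a simulated response body so that it runs without S3 access. The column names and values are made up for illustration and are not the actual user_dim schema.

```r
# Simulated stand-in for response$Body: the raw bytes of a small CSV file.
# Columns and values are invented for this example.
csv_text <- "user_id,email\n1,jane@example.com\n2,john@example.com\n"
body <- charToRaw(csv_text)

# Same two steps as in get_data_frame_from_s3: bytes -> string -> data frame
df <- read.csv(text = rawToChar(body), stringsAsFactors = FALSE)

nrow(df)
names(df)
```

Wrapping the result in as_tibble(), as the utility function does, only changes how the data frame is printed and handled by dplyr; the parsed values stay the same.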

Get time spent on courses for a single user

This code prints a data frame with all course activity for a single user. For each of their learning sessions, it lists the course, the time spent (in seconds), and the date.

library(dplyr)
library(paws)

S3_BUCKET_NAME <- "<your bucket name here>"
USER_EMAIL <- "john.doe@datacamp.com"

# Create the S3 client; authentication is done through environment variables
s3 <- paws::s3()

# Utility function to download a table's CSV file and load it into a tibble
get_data_frame_from_s3 <- function(table) {
  key <- paste0("latest/", table, ".csv")
  response <- s3$get_object(Bucket = S3_BUCKET_NAME, Key = key)

  # The response body is a raw vector: convert it to text, then parse the CSV
  df <- response$Body %>%
    rawToChar() %>%
    read.csv(text = .) %>%
    as_tibble()

  return(df)
}

# Get the required data frames
course_fact <- get_data_frame_from_s3("course_fact")
course_dim <- get_data_frame_from_s3("course_dim")
user_dim <- get_data_frame_from_s3("user_dim")

# Merge the data frames
result <- course_fact %>%
  left_join(course_dim, by = "course_id") %>%
  left_join(user_dim, by = "user_id") %>%
  select(course_id, title, time_spent, date_id, email)

# Filter the results on a single user
result <- result %>% filter(email == USER_EMAIL)

result
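To see what the joins and the filter do without connecting to S3, here is a small sketch that applies the same steps to hand-made tables. All rows are invented, and the date_id format is an assumption. Only the column names used in the script above are included. The last step, which sums the time per course and converts seconds to minutes, is an optional extra and not part of the original script.

```r
library(dplyr)

# Made-up rows for illustration; the real tables contain more columns
course_fact <- tibble(
  user_id = c(1, 1, 2),
  course_id = c(10, 11, 10),
  time_spent = c(1200, 300, 450),  # seconds
  date_id = c("2024-01-01", "2024-01-02", "2024-01-01")  # assumed format
)
course_dim <- tibble(
  course_id = c(10, 11),
  title = c("Intro to R", "Intro to SQL")
)
user_dim <- tibble(
  user_id = c(1, 2),
  email = c("john.doe@datacamp.com", "jane.doe@datacamp.com")
)

# Same joins, column selection, and filter as in the script above
result <- course_fact %>%
  left_join(course_dim, by = "course_id") %>%
  left_join(user_dim, by = "user_id") %>%
  select(course_id, title, time_spent, date_id, email) %>%
  filter(email == "john.doe@datacamp.com")

# Optional extra: total time per course, converted from seconds to minutes
totals <- result %>%
  group_by(title) %>%
  summarise(minutes_spent = sum(time_spent) / 60)

totals
```

Because left_join keeps every row of course_fact, sessions with no matching course or user stay in the result with NA in the joined columns. The filter on email then drops any session that has no matching user.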

More examples?

Reach out to your customer success manager. We are happy to help you get the data you need using our R library or SQL.
