
Python with Boto3

This page describes how to use the boto3 library in Python to analyze data from the Data Connector.

Before you begin, make sure your credentials are set up correctly, as explained in the Your credentials article.

Examples

Get a list of all users

This script retrieves a list of all users from the Data Connector and stores it in a pandas dataframe.

import pandas as pd
import boto3

S3_BUCKET_NAME = "<your bucket name here>"

# Create the client; authentication is handled through environment variables
s3_client = boto3.client('s3')

# Utility function: fetch a table's CSV file from S3 and load it into a DataFrame
def get_dataframe_from_s3(table):
    key = f'latest/{table}.csv'
    response = s3_client.get_object(Bucket=S3_BUCKET_NAME, Key=key)
    return pd.read_csv(response['Body'])

# Get the dimension CSV file that contains all users
dim_user = get_dataframe_from_s3('dim_user')
print(dim_user)

Time spent in Learn per technology

Each content type at DataCamp has an associated technology (e.g., R, Python, SQL, Spark). The code below builds a report of the time spent per technology.

import pandas as pd
import boto3

S3_BUCKET_NAME = "<your bucket name here>"

# Create the client; authentication is handled through environment variables
s3_client = boto3.client('s3')

# Utility function: fetch a table's CSV file from S3 and load it into a DataFrame
def get_dataframe_from_s3(table):
    key = f'latest/{table}.csv'
    response = s3_client.get_object(Bucket=S3_BUCKET_NAME, Key=key)
    return pd.read_csv(response['Body'])

# Get the required data frames
fact_learn_events = get_dataframe_from_s3('fact_learn_events')
dim_content = get_dataframe_from_s3('dim_content')

# Join each learn event to its content row to pick up the technology
result = fact_learn_events \
    .merge(dim_content, on='content_id', how='left') \
    [['technology', 'duration_engaged']]

# Keep only rows with engagement time; copy to avoid a SettingWithCopyWarning
result_filtered = result[result['duration_engaged'] > 0].copy()

# Convert duration_engaged from seconds to hours
result_filtered['duration_engaged'] = result_filtered['duration_engaged'] / 3600

# Calculate time spent per technology
result_grouped = result_filtered.groupby('technology')['duration_engaged'].sum()

print(result_grouped.sort_values(ascending=False))
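To hand this report to a BI tool or export it as a file, the grouped Series can be reshaped into a plain DataFrame. This follow-on sketch uses only pandas; the sample values stand in for a real `result_grouped` and the column name `hours_in_learn` is our own choice.

```python
import pandas as pd

# Hypothetical sample in the same shape as result_grouped above
result_grouped = pd.Series(
    {'Python': 120.5, 'R': 80.25, 'SQL': 40.0},
    name='duration_engaged',
)
result_grouped.index.name = 'technology'  # groupby('technology') sets this automatically

# Turn the index into a regular column and rename the value column for clarity
report = (
    result_grouped.sort_values(ascending=False)
    .reset_index()
    .rename(columns={'duration_engaged': 'hours_in_learn'})
)

report.to_csv('time_per_technology.csv', index=False)
print(report)
```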

More examples?

Please review our sample queries and queries that recreate key reports in the Groups tab.

Reach out to your customer success manager; we are happy to help you get the data you need using Python or SQL.

