Data Model

The data model for the data connector provides data on activity for users in a group across different content types. The currently supported content types are:

  • Assessments

  • Assignments

  • Certification

  • Courses

  • Practice

  • Projects

  • Tracks

  • Workspace

How to work with the data?

The data available from the Data Connector is modelled according to a dimensional model. This means there are fact, dimension and bridge tables available.

Facts are the measurements/metrics or facts from your user activity on DataCamp.

course_fact for example contains measures like XP and time_spent on courses by your users.

assessment_fact contains measures like score and percentile for the assessment scores your users have registered on DataCamp.

You can join the fact tables with the dimension tables to summarize XP and time spent across technology, topic etc.

The data model also provides dimension tables at the user level user_dim, team_dim, group_dim, and bridge tables user_team_bridge to facilitate analysis at the team or user level.

For example, you can aggregate the xp gained and time_spent by week across technology and team with the following query.

SELECT team_id,
       week_start_date,
       technology,
       SUM(xp) AS xp,
       SUM(time_spent)/3600 AS time_spent_hours
  FROM course_fact
       INNER JOIN course_dim USING(course_id)
       INNER JOIN date_dim USING(date_id)
       INNER JOIN user_team_bridge USING(user_id)
       INNER JOIN team_dim USING(team_id)
 GROUP BY 1, 2, 3

Note that all fact tables only include learning activity for the dates on which a user was a part of the group. For example, consider a user A who joined group 1234 on Jan 1st, started a course on Jan 2nd, left the group on Jan 3rd while continuing to work on the course, rejoined the group on Jan 4th, and completed the course on Jan 5th.

In this case, the fact tables for group 1234 will not contain data for User A for January 3rd (even though the user continued working on the course).

Warning! For now, the Data Connector does not include time spent on mobile, this could result in slight differences from other sources of reporting such as exports or our Group Hub.

Assessments

Assessment Dim

assessment_dim: The assessment dimension provides descriptive data about assessments.

Assessment Fact

assessment_fact: The assessment fact table provides data about the users’ assessment results: score and percentile obtained and time spent on each assessment.

Assignments

Assignment Dim

assignment_dim: The assignment dimension table provides data about a specific assignment.

Assignment Fact

assignment_fact: The assignment fact table provides data about the users’ progress on each assignment.

Certifications

Certification Dim

certification_dim: The certification dimension provides data about a specific certification. Certification ID 1 and 2 are the so-called v1 iterations of certifications. From June 2022 onwards, we switched to a process where an official certification registration was needed, and this registration would time out after 30 days.

Certification Fact

certification_fact: The certification fact table provides data about the users’ progress through different stages of a certification.

Courses

Exercise Dim

exercise_dim: The exercise dimension provides descriptive data about a specific exercise. Because the model considers users’ progress on all exercises (i.e., even exercises that are no longer available), there is a specific row with the id = -1. This row is used in the exercise fact table to refer to a deleted exercise.

Exercise Fact

exercise_fact: The exercise fact table provides data about the users’ progress: XP gained and time spent on each exercise. The table provides data for exercises that can be no longer available (and deleted from our database). In this case, the exercise_id value is set to -1 and links to a “deleted exercise”” row in the exercise_dim table.Fact table for exercise content. It provides engagement and xp measurements on exercise.

Chapter Dim

chapter_dim: The chapter dimension provides descriptive data about a specific chapter. Because the model considers users’ progress on all chapters (i.e., even chapters that are no longer available), there is a specific row with the id = -1. This row is used in the chapter fact table to refer to a deleted chapter.

Chapter Fact

chapter_fact: The chapter fact table provides data about the users’ progress: XP gained and time spent on each chapter. The table provides data for chapters that can be no longer available (and deleted from our database). In this case, the chapter_id value is set to -1 and links to a “deleted chapter” row in the chapter_dim table.

Course Dim

course_dim: The course dimension provides descriptive data about a specific course. Because the model considers users’ progress on all courses (i.e., even courses that are no longer available), there is a specific row with the id = -1. This row is used in the course fact table to refer to a deleted course.

Course Fact

course_fact: The course fact table provides data about the users’ progress: XP gained and time spent on each course. The table provides data for courses that can be no longer available (and deleted from our database). In this case, the course value is set to -1 and links to a “deleted course” row in the course_dim table.

Practice

Practice Dim

practice_dim: List of all practices

Practice Fact

practice_fact: Fact table for practice content. It provides engagement and xp measurements on projects. This table includes practices (aka challenges) after practice replaced challenges.

Projects

Project Dim

project_dim: The project dimension provides descriptive data about a specific project.

Project Fact

project_fact: The project fact table provides data about the users’ project progress: time spent and XP gained.

Tracks

Track Content Dim

track_content_dim: List of content in all live tracks. This table has one row per track_version and content item. It currently only includes courses and projects. Note that a content item can belong to multiple tracks, and different versions of a track might contain different content items.

Track Content Fact

track_content_fact: Fact table for track content.

Track Dim

track_dim: List of all versions of live tracks, along with their title, technology, and descriptions. This table has one row for every track version, for tracks that are live.

Track Fact

track_fact: Fact table for tracks.

Workspaces

Publication Fact

publication_fact: The publication fact table provides data about the users’ workspace publications: number of viewers and time spent viewing by date and viewer type (creator / viewer).

Workspace Dim

workspace_dim: The workspace dimension table provides descriptive data about a specific workspace.

Workspace Fact

workspace_fact: The workspace fact table provides data about workspaces created by users.

Workspace Visit Fact

workspace_visit_fact: The workspace visit fact table provides data user visits to workspaces.

User Teams

Team Dim

team_dim: The team dimension provides descriptive data about teams.

User Dim

user_dim: List of all enterprise users

User Team Bridge

user_team_bridge: Bridge table to link users to their teams

Others

Xp Fact

xp_fact: This fact table provides information on XP gained by a user across different content modalities.

Last updated