# Fact tables

Fact tables store quantitative information (measurements/metrics) about events (facts) based on users' activity on DataCamp.

* The fact tables available are:
  * [**fact\_learn\_events**](#fact_learn_events)
  * [**fact\_certification\_events**](#fact_certification_events)
  * [**fact\_datalab**](#fact_datalab_events)[**\_events**](#fact_workspace_events)
  * [**fact\_permission\_events**](#fact_permission_events)

Events in the fact tables only include activity that occurred **while the user was part of the group**. This means that any events from before the user joined or after they left the group are not included in these tables. As a result, the recorded activity reflects only the period during which the user was associated with the group, rather than their complete history on the platform. This is an important distinction to keep in mind when aggregating metrics using fact tables, as the totals may not represent a user’s full learning journey on DataCamp.

### fact\_learn\_events

This table captures learning events and user interactions on the platform.

These are the event types in this table:

* Starts and completion events, arguably the most common, reflect content starts and/or completions:

  * `assessment_started`  and `assessment_completed`
  * `course_started` and `course_completed`
  * `chapter_started` and `chapter_completed`
  * `exercise_completed`
  * `practice_completed`
  * `project_started` and `project_completed`
  * `track_started` and `track_completed`
  * `assignment_completed` and `assignment_completed_late`– The assignment was completed on time (on or before the due date) or late (after the due date).
  * `assignment_assigned` – The assignment was issued to the assignee.
  * `assignment_missed` – The assignment was not completed before the due date.
  * `assignment_unassigned` – The assignment was deleted after the creation.

  When an assignment is missed, an `assignment_missed` event is logged. If the user later completes the assignment after the due date, an `assignment_completed_late` event is also recorded. In this case, both the missed and late completion events are preserved.

{% hint style="info" %}
In assignment events, the `duration_engaged` and `xp_earned` columns remain NULL. Engagement time and XP are recorded through the learner’s interactions with the assigned content, such as `course_engagement` and `assessment_engaged`.
{% endhint %}

* Engagement events that reflect user interaction with particular content, such as time spent and XP earned:
  * `assessment_engaged`
  * `course_engagement`
  * `practice_engagement`
  * `project_engagement`&#x20;
* One-off XP events:
  * `alpa_onboarding`&#x20;
  * `b2b_onboarding_xp_boost`

{% hint style="info" %}
We stopped awarding `b2b_onboarding_xp_boost` in October 2024
{% endhint %}

Most fact events are associated with a specific content item via the `content_id` column, which can be used to perform a join with the `dim_content` table. For track events, the `track_version_id` column is used (and the `content_id` column will be empty), which can be used to join with the `dim_track` table. Assignment events use `assignment_id` to join with `dim_assignment` and retrieve assignment details.

Every fact event has a `occurred_at` timestamp that determines when the event happened.

Additionally, content-specific fields will have data when the event is related to specific content types. For example, `assessment_score` will contain the assessment score for `assessment_completed` events.

{% hint style="info" %}
For course-related and chapter-related events (e.g., `course_started`, `course_completed`, `chapter_started`, `chapter_completed`, `course_engagement`), the `course_variant_id` column identifies which variant the user interacted with. A user can start and complete both the classic and AI-native variant of the same course, resulting in separate events for each variant. Use `course_variant_id` to filter or group by variant. Join with `dim_course_variant` for variant metadata.
{% endhint %}

| Field                        | Description                                                                                                                                                          |
| ---------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| group\_id                    | The unique group identifier                                                                                                                                          |
| user\_id                     | The unique user identifier                                                                                                                                           |
| date\_id                     | The date identifier (YYYYMMDD)                                                                                                                                       |
| content\_id                  | The unique content identifier                                                                                                                                        |
| course\_variant\_id          | The course variant identifier. `1` = classic, `2` = AI-native. Populated for course and chapter events. Can be joined with `dim_course_variant` for variant details. |
| track\_version\_id           | The unique track identifier                                                                                                                                          |
| assignment\_id               | The unique assignment identifier                                                                                                                                     |
| event\_name                  | The name of the event                                                                                                                                                |
| occurred\_at                 | The timestamp when the event took place (UTC)                                                                                                                        |
| xp\_earned                   | The XP earned on the event                                                                                                                                           |
| duration\_engaged            | The time (in seconds) the user spent engaged with the particular content item                                                                                        |
| assessment\_score            | The assessment score (0-200)                                                                                                                                         |
| assessment\_percentile       | The percentile that corresponds to the assessment score                                                                                                              |
| assessment\_knowledge\_level | The skill level (Novice, Lower Intermediate, Upper Intermediate, Lower Advanced, Upper Advanced) that correlates to the assessment score                             |
| course\_is\_skipped          | Determines whether the course was skipped. Only used for `course_completed` events.                                                                                  |

{% hint style="warning" %}
Not all events in the fact learn events table will have a corresponding content item in `dim_content`. This is due to a limitation in our system, where some content items are hard deleted on our end. Despite this, we still include events related to these deleted content items in the fact table because they represent valuable user activity—users gained XP and spent time engaging with the content. Even if the content reference is missing, these events provide meaningful insights into user learning behavior and should not be excluded from reporting.\
\
Please check the [Domain Gotchas](https://enterprise-docs.datacamp.com/integrating-our-data-into-your-tools-via-data-connector-2.0/domain-gotchas) section for more details.
{% endhint %}

### fact\_certification\_events

This table captures certification events and user interactions on the platform.

These are the event types in this table:

* Certification milestone events
  * `certification_registered`
  * `certification_withdrawn`&#x20;
  * `certification_failed`
  * `certification_expired`
  * `certification_granted`
* Certification component events
  * Attempts
    * `certification_attempt_expired`&#x20;
    * `certification_out_of_attempts`
  * Case Studies
    * `certification_case_study_registered`&#x20;
    * `certification_case_study_presentation_submitted`
    * `certification_case_study_graded`
    * `certification_case_study_failed`
    * `certification_case_study_passed`
  * &#x20;Project
    * `certification_project_passed`
  * Skill Assessment
    * `certification_skill_assessment_failed`
    * `certification_skill_assessment_passed`

Most factual events are associated with a specific certification attempt via the `certification_id` column, which can be used to perform a join with the `dim_certification` table. Every time a user attempts a certification, they are assigned a unique `attempt_id.`

Every fact event has a `occurred_at` timestamp that determines when the event happened.

| Field             | Description                                   |
| ----------------- | --------------------------------------------- |
| group\_id         | The unique group identifier                   |
| user\_id          | The unique user identifier                    |
| date\_id          | The date identifier (YYYYMMDD)                |
| certification\_id | The unique certification identifier           |
| event\_name       | The name of the event                         |
| xp\_earned        | The XP earned on the event                    |
| occurred\_at      | The timestamp when the event took place (UTC) |
| attempt\_id       | The unique certification attempt identifier   |

### fact\_datalab\_events

This table captures events related to interactions within DataLab on the platform.

These are the event types in this table:

* Workbook creation:
  * `workspace_created`
* Engagement events that reflect interactions with DataLab workbooks:
  * `workspace_publication_viewed`
  * `workspace_viewed`&#x20;
    * When a user spends time viewing a DataLab workbook
  * `workspace_visited`&#x20;
    * When a user spends time either viewing or editing a workbook

{% hint style="warning" %}
Please note that the workspace\_viewed events are a subset of workspace\_visited events. Only one of the two events should be used when calculating time spent on DataLab workbooks.
{% endhint %}

Most fact events are associated with a specific DataLab workbook via the `workbook_id` column. The `workbook_id` is used in the URL for the workbook. DataLab workbooks have the following URLs: `https://www.datacamp.com/datalab/w/workbook_id`.

Every fact event has a `occurred_at` timestamp that determines when the event happened.

| Field             | Description                                                       |
| ----------------- | ----------------------------------------------------------------- |
| group\_id         | The unique group identifier                                       |
| user\_id          | The unique user identifier                                        |
| date\_id          | The date identifier (YYYYMMDD)                                    |
| workbook\_id      | The unique workbook identifier                                    |
| event\_name       | The name of the event                                             |
| occurred\_at      | The timestamp when the event took place (UTC)                     |
| duration\_engaged | The time (in seconds) when the user was engaged with the workbook |

### fact\_permission\_events

This fact table captures permission, license and invite related events across the platform.

**Event types in this table include:**

* **Subscription:** `subscription_started`, `subscription_ended` (with related subscription metadata, e.g., license counts).
* **License:** `user_license_allocated`, `user_license_revoked`.
* **Invites:** `invite_sent`, `invite_claimed`, `invite_withdrawn`.

Each event is one row and carries an `occurred_at` timestamp indicating when the event happened.

| Field                   | Description                                                                |
| ----------------------- | -------------------------------------------------------------------------- |
| group\_id               | The unique group identifier if the event is associated with the group      |
| team\_id                | The unique team identifier if the event is associated with a specific team |
| user\_id                | The unique user identifier if the event is associated with a specific user |
| email                   | The email of the person invited                                            |
| date\_id                | The date identifier (YYYYMMDD)                                             |
| event\_name             | The name of the event                                                      |
| occurred\_at            | The timestamp when the event took place (UTC)                              |
| invite\_id              | The unique invite identifier                                               |
| is\_active              | Indicates whether the subscription or the license is currently active      |
| product                 | Product the event is associated to, e.g. 'learn' or 'workspace'            |
| license                 | License the event is associated to, e.g. 'enterprise', 'teams', etc...     |
| slug                    | The product slug, e.g. 'learn.enterprise'                                  |
| nb\_licenses\_purchased | The number of licenses purchased.                                          |
