# Domain Gotchas

In this section, we'll explore some common gotchas and pitfalls that can arise when working with Data Connector 2.0. These pitfalls can lead to inaccurate reports if not considered.&#x20;

### Calculating XP

The correct way to count XP is not to simply do a SUM on the `xp_earned` column of the `fact_learn_events` table, as this will end up in double counting. For example, for course XP, we have both `course_engagement` and `exercise_completed` events with a non-null `xp_earned` value; both contain XP gained from completing an exercise, so including both would lead to double counting XP earned in courses.

Starting from December 2025, certification completions also grant XP, and these events must be included in the total XP calculation.

To correctly count XP and have it match what is reported in the Groups tab, only use the following events:

* `assessment_engaged`&#x20;
* `certification_granted` (`fact_certification_events`)
* `course_engagement`
* `practice_engagement`
* `project_engagement`
* `b2b_onboarding_xp_boost`
* `alpa_onboarding`&#x20;

{% hint style="warning" %}
The sum of a user's XP in this table should not be expected to equal the total XP they have on the platform. This is because, in the Data Connector, admins can only see the activity that a user completed while they were part of the group. Any XP earned outside of the group—such as before joining, or after leaving—will not be reflected in this dataset.
{% endhint %}

### Missing Content Items in dim\_content&#x20;

Not all events in the `fact_learn_events` table will have a corresponding content item in `dim_content`. This is because some content items may be hard deleted in our system, meaning they are permanently removed rather than being soft deleted or archived. As a result, any events tied to these deleted content items will no longer have a valid `content_id` reference in `dim_content`.

Despite this limitation, we still retain these events in the `fact_learn_events` table. These events represent real user activity, including time spent and XP earned, and are valuable for tracking engagement. When analyzing data, be aware that some events may not join to `dim_content`, but they remain important for understanding overall user behavior.

### Course Variant

#### **Duplicate events when a course has multiple variants**

With the introduction of course variants (**classic** and **AI-native**), a user can now start and complete both variants of the same course. This means:

* A `course_completed` event can appear **twice** for the same user and course, once for each variant.
* The same applies to `course_started`, `chapter_started`, `chapter_completed`, and engagement events.

If you are counting course completions, starts, or time spent, decide whether you want to measure **per variant** or **per course**:

* **Per variant:** No changes needed, each variant's events are already separate rows. You can add `course_variant_id` to your `GROUP BY` or `SELECT` clause for clarity.
* **Per course (deduplicated):** Filter to a single variant (e.g., `WHERE course_variant_id = 1` for classic only) or deduplicate using `DISTINCT` on the combination of `user_id` and `course_id`.

#### AI-native chapters are new records in dim\_content

AI-native courses introduce new chapter records in `dim_content`. These chapters:

* Have their own unique `content_id` and `chapter_id` values
* Are associated with `course_variant_id = 2`
* Share the same `course_id` as the classic chapters of the same course

If you aggregate metrics at the chapter level (e.g., chapters started, chapters completed), your totals will now include both classic and AI-native chapters. To analyze only one variant, filter using `course_variant_id`:

```sql
-- Classic chapters only
WHERE content_type = 'chapter' AND course_variant_id = 1

-- AI-native chapters only
WHERE content_type = 'chapter' AND course_variant_id = 2
```

#### `chapter_id` is now a string

The `chapter_id` column in `dim_content` has been changed from an integer to a string type. This is a **breaking change** for existing queries that:

* Cast `chapter_id` to an integer, e.g., `CAST(chapter_id AS INT)` or `CAST(chapter_id AS BIGINT)`
* Use integer comparison operators on `chapter_id`
* Join `chapter_id` to another table or column that expects an integer type

**How to fix:** Treat `chapter_id` as a string in all queries. Use string comparison operators and update any downstream schemas or BI tool column types accordingly.
