# Sample queries

In this section, we provide several sample SQL queries to answer typical reporting needs like:

* [Users who have gained XP in the last 7 days](#users-who-have-gained-xp-in-the-last-7-days)
* [All current members of the group](#all-current-members-of-the-group)
* [Time spent in Learn per technology](#time-spent-per-technology)
* [Time spent in Learn per team](#time-spent-per-team)
* [Time spent in Learn per user](#time-spent-in-learn-per-user)
* [XP earned by user](#xp-earned-by-user)
* [Completed courses by user](#completed-courses-by-user)
* [Completed assessments by user](#completed-assessments-by-user)

Additionally, in the following section, we provide [SQL queries that replicate key reports in the Groups tab](/integrating-our-data-into-your-tools-via-data-connector-2.0/queries-to-recreate-key-reports-in-the-groups-tab.md).

{% hint style="info" %}
Courses on DataCamp can now have multiple variants: **classic** and **AI-native**. A user can complete both variants of the same course. The sample queries below return results across all variants by default. To filter by variant, add `course_variant_id` to your `WHERE` clause (e.g., `WHERE course_variant_id = 1` for classic only). See Domain Gotchas for more details.
{% endhint %}

### Users who have gained XP in the last 7 days

To track which users actively engage with the platform, you can use [XP](/understanding-reports-with-clarity-definitions.md) as a proxy for activity and look at users who have recently earned XP.

The following query returns a list of all users that earned XP in the last 7 days.

```sql
SELECT
  user.email,
  max(events.occurred_at) AS last_xp_at
FROM data_connector_1234.fact_learn_events AS events
LEFT JOIN data_connector_1234.dim_user AS user 
  ON events.user_id = user.user_id
WHERE events.event_name IN ('assessment_engaged', 
                    'course_engagement', 
                    'practice_engagement', 
                    'project_engagement', 
                    'b2b_onboarding_xp_boost', 
                    'alpa_onboarding')
  AND date(events.occurred_at) >= date_add('day', -7, current_date)
  AND events.xp_earned > 0
GROUP BY user.email
ORDER BY last_xp_at DESC
```

{% hint style="warning" %}
In the query above, we specify the XP events to include. This is to avoid double counting XP in courses since the dataset contains  `course_engagement` and `exercise_completed` events with a non-zero `xp_earned` value. Both contain XP gained from completing an exercise, so including both would lead to double counting XP earned in courses.&#x20;

Please check the [Domain Gotchas](/integrating-our-data-into-your-tools-via-data-connector-2.0/domain-gotchas.md) section for more details.
{% endhint %}

You can also use the query above to identify your top learners for an XP competition. Simply modify the date filter to include the competition period.

### All current members of the group

Another common request is a list of the organization's users. The query below returns a list of all current members in your group and their start date.

```sql
SELECT 
  bug.user_id,
  user.email,
  bug.started_at
FROM data_connector_1234.bridge_user_group AS bug
LEFT JOIN data_connector_1234.dim_user AS user
  ON bug.user_id = user.user_id
WHERE bug.ended_at IS NULL
ORDER BY email
```

{% hint style="info" %}
**Removed users**

Please notice the `bug.ended_at` filter. If you leave this out, the query will also return the users who have been in your group but have since been removed. You can then include `bug.ended_at` in the `SELECT` clause to get the date a user left the group.
{% endhint %}

### Time spent in Learn per technology

Each content type at DataCamp has an associated [technology](/understanding-reports-with-clarity-definitions.md) (e.g., R, Python, SQL, Spark, etc.). If you want to examine what technologies your users are most focused on, you can create a report with the time spent per technology with the query below.&#x20;

```sql
SELECT
  content.technology,
  if(sum(events.duration_engaged) = 0, 0, 
    sum(events.duration_engaged) / 3600) AS hours_spent
FROM data_connector_1234.fact_learn_events AS events
LEFT JOIN data_connector_1234.dim_content AS content
  ON events.content_id = content.content_id
WHERE content.technology IS NOT NULL
GROUP BY content.technology
ORDER BY content.technology
```

You can include a restriction in the `WHERE` clause if you would like to limit the results to a particular period.

### Time spent in Learn per team

If you want to see how different teams in your organization are taking advantage of our platform, we can modify the query above to look at the time spent per team in Learn content (assessments, courses, practices, and projects).

<pre class="language-sql"><code class="lang-sql"><strong>WITH team_data AS (
</strong>  SELECT
    user.user_id,
    team.team_name
  FROM data_connector_1234.dim_team AS team
  INNER JOIN data_connector_1234.bridge_user_team AS user
    ON team.team_id = user.team_id
  WHERE team.deleted_at IS NULL -- Team hasn't been deleted
    AND user.ended_at IS NULL -- User has not left team
)

SELECT
  teams.team_name,
  if(sum(events.duration_engaged) = 0, 0, 
    sum(events.duration_engaged) / 3600) AS hours_spent
FROM data_connector_1234.fact_learn_events AS events
INNER JOIN team_data AS teams
  ON events.user_id = teams.user_id
GROUP BY teams.team_name
ORDER BY teams.team_name
</code></pre>

{% hint style="info" %}
**On team membership and scope**

* Keep in mind that the query above only shows XP gained by the users currently in the respective team; this means that once someone leaves a team, the team's XP calculation would decrease.
* A single user can be a member of multiple teams, which means their XP is included in each team's total. Adding up these team-XP values will not match the total XP gained across all users.

*If you would like `time_spent` to remain allocated to the team even if a user has left the team, you can use the `bridge_user_team` table which has `started_at` and `ended_at` columns to calculate the period the user was part of a team.*
{% endhint %}

You can include a restriction in the `WHERE` clause of the bottom section if you want to limit the results to a particular period.

### Time spent in Learn per user

You may be interested in reviewing the time each of your organization's users has spent learning on our platform. The following query returns each user's time spent in Learn content (assessments, courses, practices, and projects).

```sql
SELECT
  user.email,
  CASE                                                                                                                                                                                                                                                      
    WHEN sum(events.duration_engaged) = 0 THEN 0
    ELSE sum(events.duration_engaged) / 3600
  END AS hours_spent
FROM data_connector_1234.fact_learn_events AS events
LEFT JOIN data_connector_1234.dim_user AS user
  ON events.user_id = user.user_id
-- Exclude deleted users
WHERE user.email NOT LIKE('%@deleted.datacamp.com')
GROUP BY user.email
ORDER BY email
```

{% hint style="info" %}
DataLab, Certification, and other products are not included in the query above.
{% endhint %}

You can include a restriction in the `WHERE` clause if you would like to limit the results to a particular period. &#x20;

### Time spent on AI-native courses

If you want to review how much time your users are spending specifically on AI-native courses, you can filter learning events by course variant. The following query returns the total time spent per user on AI-native courses.

```sql
SELECT                                                                                                                                                                                                                                                    
    user.email,   
    CASE                                                                                                                                                                                                                                                      
      WHEN sum(events.duration_engaged) = 0 THEN 0
      ELSE sum(events.duration_engaged) / 3600
    END AS hours_spent
  FROM data_connector_1234.fact_learn_events AS events                                                                                                                                                                                                      
  LEFT JOIN data_connector_1234.dim_user AS user
    ON events.user_id = user.user_id                                                                                                                                                                                                                        
  WHERE events.course_variant_id = 2
    -- Exclude deleted users                                                                                                                                                                                                                                
    AND user.email NOT LIKE('%@deleted.datacamp.com')
  GROUP BY user.email                                                                                                                                                                                                                                       
  ORDER BY hours_spent DESC
```

{% hint style="info" %}
To compare time spent across variants, you can replace course\_variant\_id = 2 with a GROUP BY on events.course\_variant\_id to see classic and AI-native side by side. You can also break this down further by course by adding content.content\_title from dim\_content                                                                                                                                                                                                                                                     &#x20;
{% endhint %}

You can include a restriction in the WHERE clause if you would like to limit the results to a particular period.

### XP earned by user

A different way to look at engagement is to measure [XP](/understanding-reports-with-clarity-definitions.md). You can track how active each of your organization's users has been on our platform with the query below that displays the total XP per user.

```sql
SELECT
  user.email,
  sum(events.xp_earned) AS total_xp
FROM data_connector_1234.fact_learn_events AS events
LEFT JOIN data_connector_1234.dim_user AS user 
  ON events.user_id = user.user_id
WHERE events.event_name IN ('assessment_engaged', 
                    'course_engagement', 
                    'practice_engagement', 
                    'project_engagement', 
                    'b2b_onboarding_xp_boost', 
                    'alpa_onboarding')
  AND events.xp_earned > 0
  -- Exclude deleted users
  AND user.email NOT LIKE('%@deleted.datacamp.com')
GROUP BY user.email
ORDER BY total_xp DESC
```

{% hint style="warning" %}
In the query above, we specify the XP events to include. This is to avoid double counting XP in courses since the dataset contains  `course_engagement` and `exercise_completed` events with a non-zero `xp_earned` value. Both contain XP gained from completing an exercise, so including both would lead to double counting XP earned in courses.&#x20;

Please check the [Domain Gotchas](/integrating-our-data-into-your-tools-via-data-connector-2.0/domain-gotchas.md) section for more details.
{% endhint %}

You can include a restriction in the `WHERE` clause if you would like to limit the results to a particular period.

### Completed courses by user

It is often necessary to review learning activity on a more granular level. A common question is, "What courses have our users completed?" The query below returns all courses completed by users in your organization.

```sql
SELECT
  events.user_id,
  user.email,
  content.course_id,
  content.content_title,
  events.occurred_at AS completed_at
FROM data_connector_1234.fact_learn_events AS events
LEFT JOIN data_connector_1234.dim_content AS content
  ON events.content_id = content.content_id
LEFT JOIN data_connector_1234.dim_user AS user
  ON events.user_id = user.user_id
WHERE events.event_name = 'course_completed'
  -- Exclude deleted users
  AND user.email NOT LIKE('%@deleted.datacamp.com')
ORDER BY user.email, completed_at DESC
```

You can include a restriction in the `WHERE` clause if you would like to limit the results to a particular period.

{% hint style="info" %}
The query above returns completions across all course variants. If a user has completed both the classic and AI-native variant of a course, both completions will appear. To see which variant was completed, add `events.course_variant_id` to the `SELECT` clause. To count each course only once regardless of variant, wrap the query in a `GROUP BY` on `user_id` and `course_id`.
{% endhint %}

### Completed courses by user and variant

If you want to analyze course completions broken down by course variant (classic vs AI-native), use the following query:

```sql
SELECT
  events.user_id,
  user.email,
  content.course_id,
  content.content_title,
  variant.course_variant_name,
  events.occurred_at AS completed_at
FROM data_connector_1234.fact_learn_events AS events
LEFT JOIN data_connector_1234.dim_content AS content
  ON events.content_id = content.content_id
LEFT JOIN data_connector_1234.dim_course_variant AS variant
  ON events.course_variant_id = variant.course_variant_id
  AND content.content_id = variant.content_id
LEFT JOIN data_connector_1234.dim_user AS user
  ON events.user_id = user.user_id
WHERE events.event_name = 'course_completed'
  AND user.email NOT LIKE('%@deleted.datacamp.com')
ORDER BY user.email, completed_at DESC
```

This query joins `dim_course_variant` to display the variant name alongside each completion. You can filter to a specific variant by adding `AND events.course_variant_id = 1` (classic) or `AND events.course_variant_id = 2` (AI-native) to the `WHERE` clause.

### Completed assessments by user

Testing a user's skill level is an integral part of learning. If you want a report of all the assessments your organization's users have completed, the query below will tell you all complete assessments and their user, score, percentile, and completion date.

```sql
SELECT
  user.email,
  content.content_title AS assessment_title,
  events.assessment_score,
  events.assessment_percentile,
  events.assessment_knowledge_level,
  events.occurred_at AS completed_at
FROM data_connector_1234.fact_learn_events AS events
LEFT JOIN data_connector_1234.dim_content AS content
  ON events.content_id = content.content_id
LEFT JOIN data_connector_1234.dim_user AS user
  ON events.user_id = user.user_id
WHERE events.event_name = 'assessment_completed'
ORDER BY completed_at DESC
```

You can include a restriction in the `WHERE` clause if you would like to limit the results to a particular period.


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://enterprise-docs.datacamp.com/integrating-our-data-into-your-tools-via-data-connector-2.0/sample-queries.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
