How Recursive CTEs Unlock Practical, Scalable Navigation of Organizational Hierarchies in SQL
Ludek Stehlik, Ph.D.'s recent article, "A graph approach to reporting structures: stop wrestling recursion," offers an insightful exploration of graph-based roll-up metrics. His approach prompted me to reflect on additional ways teams can navigate reporting hierarchies in SQL. In this post, I’d like to add to the conversation by sharing how recursive CTEs can also play a valuable role in these scenarios. While graph-based metrics are excellent for complex network-like hierarchies, recursive CTEs are particularly suited for straightforward, layer-by-layer hierarchical navigation. By considering both methods side by side, analysts can select the most suitable tool for their specific scenario and expand the toolkit available for addressing various organizational data challenges. Here’s why recursive CTEs are often a practical fit for modeling organizational hierarchies:
- Straightforward Adoption: Recursive CTEs build on standard SQL, making them accessible for analysts and engineers alike. This reduces the learning curve and integrates seamlessly into established data practices.
- Broad Compatibility: Recursive CTEs are supported across nearly all modern SQL databases, such as Snowflake, Redshift, and BigQuery. This ensures that solutions remain portable without relying on niche features or vendor-specific extensions, providing reassurance that a wide range of database environments can seamlessly adopt these constructs.
- Efficient for Typical Hierarchies: For most organizational charts and reporting structures, recursive CTEs deliver the needed results efficiently, without the overhead of more complex tooling.
- Seamless Integration: Recursive CTEs slot directly into current SQL workflows and BI dashboards, allowing teams to extend their analytics without rethinking architecture.
What is a Recursive CTE?
In technical terms:
A recursive CTE (Common Table Expression) is a SQL construct that enables a query to reference itself, allowing for the retrieval of hierarchical or recursive data structures, such as organizational charts or tree structures.
In less technical terms:
A recursive CTE is like having a set of instructions to identify everyone in a company who reports directly or indirectly to the CEO. You start with the CEO at the top. Then, you look for all the people who report to the CEO. Next, you look for everyone who reports to those people, and you continue this process, going level by level, until you’ve found everyone in the organization. It’s a method of working step by step through the company’s structure, ensuring that no one is missed, regardless of the number of layers between the CEO and the rest of the staff.
Breaking Down the Structure of a Recursive CTE
A recursive CTE has four main parts:
- The first SELECT statement (the anchor member): Consider this the starting point. It finds the CEO, the person at the very top of the organization.
- The second SELECT statement (the recursive member): This part is responsible for building out the rest of the organization. It looks for everyone who reports directly to the CEO. Then, for each of those people, it repeats the process, finding their direct reports, and so on. It continues, layer by layer, until it has reached everyone in the company, regardless of their position in the hierarchy.
- The UNION ALL statement: This combines the CEO’s record with those of everyone else.
- The search boundary (this is important!): Despite our best intentions, it’s possible with recursive CTEs to end up in a situation where the query is in an infinite loop. This commonly happens if a record in your data has the employee reporting to themselves. If this happens, the query will continue to join the employee back to themselves, and the process will never end. We use the WHERE clause in the recursive member to prevent this. I like to use a level restriction for this. The CEO starts at level 0, and for each iteration down the hierarchy, the level increases.
Termination of the recursive CTE occurs when either there are no more records to join on or the condition of the WHERE clause in the recursive member evaluates to TRUE.
Here’s a sample of what a basic recursive CTE may look like:
with recursive org (worker_id, manager_id, level) as (
--Anchor member
select
emp.worker_id,
emp.manager_id,
0 as level
from employee as emp
where emp.manager_id is null
union all
--Recursive member
select
emp.worker_id,
emp.manager_id,
org.level + 1 as level
from employee as emp
inner join org
on emp.manager_id = org.worker_id
where org.level < 10
)
select * from org
Going Beyond Basics: Capturing Reporting Paths
I like to add two path fields to my recursive CTEs for employee hierarchies.
- path_id - A concatenated list of IDs of all the employees this person reports up through.
- path_name - A concatenated list of names of all the employees this person reports up through.
Here’s what these fields end up looking like in practice:
worker_id | preferred_fullname | manager_id | job_title | level | path_id | path_name |
1001 | David Wallace | null | CEO | 0 | null | null |
1002 | Michael Scott | 1001 | Regional Manager | 1 | 1001 | David Wallace |
1003 | Dwight Schrute | 1002 | Assistant to the Regional Manager | 2 | 1001 / 1002 | David Wallace / Michael Scott |
1004 | Jim Halpert | 1002 | Senior Sales Representative | 2 | 1001 / 1002 | David Wallace / Michael Scott |
1005 | Pam Beesly | 1002 | Receptionist | 2 | 1001 / 1002 | David Wallace / Michael Scott |
1006 | Ryan Howard | 1002 | Intern | 2 | 1001 / 1002 | David Wallace / Michael Scott |
These fields come in handy later in subsequent work when we want to start calculating metrics for each leader, such as:
- Total Number of Reports - The total number of workers reporting to a leader, including both direct and indirect reports within their organizational span.
- Number of Direct Reports - The count of workers who report directly and immediately to a leader.
- Number of Indirect Reports - The count of workers who report to the leader through one or more layers of management (i.e., not directly).
- Level From Top - The number of steps or layers separating a leader from the topmost position in the organization (such as the CEO).
- Levels In Org - The number of hierarchical layers beneath a leader, down to the most junior employee within their organization. This measures the depth of the leader’s reporting structure.
Example Implementation
Here’s a SQL code sample that you can modify and adapt to use in your own organization.
with recursive org (worker_id, preferred_fullname, manager_id, job_title, level, path_id, path_name) as (
--Anchor member
select
emp.worker_id,
emp.preferred_fullname,
emp.manager_id,
emp.job_title,
0 as level,
cast(emp.worker_id as varchar(1000)) as path_id,
cast(emp.preferred_fullname as varchar(1000)) as path_name
from employee as emp
where emp.manager_id is null
union all
--Recursive member
select
emp.worker_id,
emp.preferred_fullname,
emp.manager_id,
emp.job_title,
org.level + 1 as level,
concat(concat(org.path_id, ' / '), emp.worker_id) as path_id,
concat(concat(org.path_name, ' / '), emp.preferred_fullname) as path_name
from employee as emp
inner join org
on emp.manager_id = org.worker_id
where org.level < 10
),
--Calculate total reports & max level by organization level
total_unclean as (
select
trim(split_part(rtrim(replace(path_id, worker_id, ''), '/'), '/', 1)) as ceo,
trim(split_part(rtrim(replace(path_id, worker_id, ''), '/'), '/', 2)) as l1,
trim(split_part(rtrim(replace(path_id, worker_id, ''), '/'), '/', 3)) as l2,
trim(split_part(rtrim(replace(path_id, worker_id, ''), '/'), '/', 4)) as l3,
count(worker_id) as num_total_reports,
max(level) as max_level
from org
group by grouping sets(ceo, l1, l2, l3) order by 1, 2, 3
),
--Consolidate manager IDs
total as (
select distinct
coalesce(nullif(ceo, ''), nullif(l1, ''), nullif(l2, ''), nullif(l3, '')) as manager_id,
num_total_reports,
max_level
from total_unclean
),
--Calculate direct reports
direct as (
select
manager_id,
count(worker_id) as num_direct_reports
from org
group by manager_id
)
--Join it back together
select
org.worker_id,
org.preferred_fullname,
org.manager_id,
org.job_title,
org.level,
nullif(rtrim(replace(org.path_id, org.worker_id, ''), ' / '), '') as path_id,
nullif(rtrim(replace(org.path_name, org.preferred_fullname, ''), ' / '), '') as path_name,
nullif(trim(split_part(rtrim(replace(org.path_name, org.preferred_fullname, ''), ' / '), '/', 2)), '') as l1,
nullif(trim(split_part(rtrim(replace(org.path_name, org.preferred_fullname, ''), ' / '), '/', 3)), '') as l2,
nullif(trim(split_part(rtrim(replace(org.path_name, org.preferred_fullname, ''), ' / '), '/', 4)), '') as l3,
coalesce(dir.num_direct_reports > 0, false) as is_manager,
coalesce(tot.num_total_reports, 0) as num_total_reports,
coalesce(dir.num_direct_reports, 0) as num_direct_reports,
tot.max_level - org.level as levels_in_org
from org
left join total as tot
on org.worker_id = tot.manager_id
left join direct as dir
on org.worker_id = dir.manager_id
Closing Thoughts
Once you get comfortable with the code, there are countless ways to expand its use. For instance, you can modify it to build employee rosters, incorporate logic to count only specific types of employees in your metrics, or layer in additional data to enable metrics such as the number of top performers for each leader. You might even pivot the data for a deeper look at span of control across your organization.
Ultimately, recursive CTEs remain a powerful and accessible tool for navigating organizational hierarchies in SQL. By understanding their structure and capabilities, you can surface new insights into your company’s reporting dynamics. Whether used alone or in conjunction with graph-based approaches, recursive CTEs provide analysts and engineers with a flexible and scalable method for modeling, analyzing, and adapting to the evolving needs of modern organizations.
Reference:
Stehlík, L. (2025, October 31). Ludek's blog about people analytics: A graph approach to reporting structures: Stop wrestling recursion. https://blog-about-people-analytics.netlify.app/posts/2025-10-31-org-stats-and-graph-analysis/