
Power BI Premium Cost Optimization: A Technical Audit Framework

How to reduce Fabric capacity spend by 30-50% through systematic workspace governance, refresh scheduling, and capacity right-sizing without sacrificing performance.

Microsoft Fabric licensing operates on a capacity-based model where organizations purchase compute units (CUs) that are shared across all workloads—Power BI, Data Factory, Synapse, and more. The billing structure creates a persistent optimization challenge: overprovision and you burn budget; underprovision and reports time out during peak hours.

Most organizations default to overprovisioning because the alternative—angry executives staring at spinning loading icons—is politically unacceptable. But overprovisioning at F64 or P2 tier when an F32 or P1 would suffice wastes tens of thousands of dollars annually.

This article presents the systematic audit framework we apply at Montinegro Corp to right-size Fabric capacity, eliminate waste, and reduce client spend by 30-50% without degrading the end-user experience.

Understanding the Cost Structure

Before optimizing, you need to understand what you’re paying for. Fabric capacity pricing follows a tiered model:

SKU     Capacity Units (CU)     Approximate Monthly Cost (USD)
F2      2                       ~$260
F4      4                       ~$520
F8      8                       ~$1,040
F16     16                      ~$2,080
F32     32                      ~$4,160
F64     64                      ~$8,320
F128    128                     ~$16,640

The critical detail most organizations miss: capacity is billed continuously, not per-use. An F64 running at 5% utilization at 3 AM costs the same as an F64 running at 95% utilization at 9 AM.

Phase 1: The Capacity Utilization Audit

The first step is measuring actual utilization. Microsoft provides the Fabric Capacity Metrics app, but its default views aggregate data in ways that obscure the optimization opportunities.

We run programmatic audits using the Power BI REST API and DMV (Dynamic Management Views) queries through our admin script catalog:

Workspace-Level CU Consumption

# Simplified extraction pattern
import requests

def get_workspace_consumption(access_token, capacity_id, days=30):
    """Extract per-workspace CU consumption over a time period."""
    headers = {'Authorization': f'Bearer {access_token}'}

    # Workload configuration for the capacity (admin scope required)
    url = f'https://api.powerbi.com/v1.0/myorg/admin/capacities/{capacity_id}/workloads'
    response = requests.get(url, headers=headers)
    response.raise_for_status()
    workloads = response.json()['value']

    # Cross-reference with workspace activity events over the last `days` days
    # (the activity events endpoint is queried one UTC day at a time)
    activity_url = 'https://api.powerbi.com/v1.0/myorg/admin/activityevents'
    # ... aggregate by workspace, compute CU attribution
    return workloads

The output reveals which workspaces are consuming disproportionate capacity. In every audit we’ve conducted, the distribution follows a consistent pattern:

  • Top 5% of workspaces consume 60-70% of total CU
  • Bottom 50% of workspaces consume less than 5% of total CU
  • 10-20% of workspaces are completely inactive (no refreshes, no queries in 30+ days)
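
A quick way to test whether your tenant matches this pattern is to compute the shares directly from the audit output. A minimal sketch, assuming you already hold a dict of per-workspace CU totals from the script above (the consumption_profile helper and its sample input are illustrative):

# Illustrative helper: summarize how concentrated CU consumption is,
# given {workspace_name: total_cu} produced by the audit script above.
def consumption_profile(cu_by_workspace):
    totals = sorted(cu_by_workspace.values(), reverse=True)
    grand_total = sum(totals) or 1  # avoid division by zero on empty input

    top_5pct_n = max(1, len(totals) * 5 // 100)
    bottom_half_n = len(totals) // 2

    return {
        'workspaces': len(totals),
        'top_5pct_share': sum(totals[:top_5pct_n]) / grand_total,
        'bottom_50pct_share': sum(totals[-bottom_half_n:]) / grand_total if bottom_half_n else 0.0,
        'zero_cu_workspaces': sum(1 for v in totals if v == 0),
    }

profile = consumption_profile({'Finance': 820, 'Sales': 540, 'Sandbox': 3, 'Archive': 0})
print(profile)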

The Zombie Workspace Problem

Inactive workspaces don’t consume CU directly, but they add organizational overhead: admin attention, security surface area, and confusion about what’s active. More critically, workspaces that nobody views often still contain datasets with scheduled refreshes that no one consumes.

We’ve found clients running 40+ daily dataset refreshes that feed reports nobody has opened in months. Each refresh consumes CU for zero business value.

Remediation: Query the activity events API for ViewReport events per workspace over 90 days. Any workspace with zero view events is a candidate for archival or deletion.
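
A minimal sketch of that check, assuming an admin-scoped token; the activity events endpoint only accepts one UTC day per request, so the 90-day window is walked day by day, and the workspace-name field is read defensively because field casing varies across event types:

# Sketch: count ViewReport events per workspace over a trailing window.
# Assumes `access_token` carries Power BI admin (tenant read) permissions.
import datetime as dt
import requests
from collections import Counter

API = 'https://api.powerbi.com/v1.0/myorg/admin/activityevents'

def view_report_counts(access_token, days=90):
    headers = {'Authorization': f'Bearer {access_token}'}
    counts = Counter()
    today = dt.datetime.now(dt.timezone.utc).date()

    for offset in range(days):
        day = today - dt.timedelta(days=offset)
        url = (
            f"{API}?startDateTime='{day}T00:00:00Z'&endDateTime='{day}T23:59:59Z'"
            "&$filter=Activity eq 'ViewReport'"
        )
        while url:  # follow continuation links until the day is exhausted
            resp = requests.get(url, headers=headers)
            resp.raise_for_status()
            body = resp.json()
            for event in body.get('activityEventEntities', []):
                name = event.get('WorkSpaceName') or event.get('WorkspaceName') or 'Unknown'
                counts[name] += 1
            if body.get('lastResultSet'):
                break
            url = body.get('continuationUri')
    return counts

Workspaces whose count stays at zero across the window are the archival candidates.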

Phase 2: Refresh Schedule Optimization

Dataset refreshes are the single largest CU consumer in most Power BI deployments. A poorly scheduled refresh strategy can consume 3-5x more capacity than necessary.

The Problem: Synchronized Refreshes

Many organizations schedule all refreshes at the same time—typically 6 AM or midnight. This creates a massive CU spike that either:

  1. Exceeds capacity limits, causing throttling and failed refreshes
  2. Forces the organization to purchase higher-tier capacity to handle the 30-minute peak

For the remaining 23.5 hours of the day, the capacity sits underutilized.

The Solution: Staggered Refresh Windows

We calculate optimal refresh schedules based on three factors:

  1. Business criticality: Executive dashboards refresh first
  2. Data freshness requirements: Real-time operational reports vs weekly strategic reports
  3. Dataset size: Larger datasets get isolated time windows to avoid CU contention

Optimized Schedule Example:
────────────────────────────────────────
00:00 - 01:00  │ Tier 1: Executive dashboards (3 datasets)
01:00 - 02:00  │ Tier 2: Operations reports (8 datasets)
02:00 - 04:00  │ Tier 3: Analytical models (15 datasets)
04:00 - 05:00  │ Tier 4: Development workspaces (5 datasets)
06:00 - 06:30  │ Tier 1: Morning refresh for market-open data
12:00 - 12:30  │ Tier 1: Midday refresh
────────────────────────────────────────
Peak CU draw: ~45% (was 95%+ with synchronized schedule)
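
Rolling a staggered plan out by hand across dozens of datasets doesn’t scale, so we script it against the refresh-schedule endpoint of the Power BI REST API. A minimal sketch, assuming a tier-to-time mapping of your own (the TIER_WINDOWS values below are illustrative, not prescriptive):

# Sketch: push staggered refresh times to datasets, grouped by tier.
# Assumes the caller owns the datasets or holds the required API consent.
import requests

TIER_WINDOWS = {                      # illustrative mapping, not prescriptive
    'tier1': ['00:00', '06:00', '12:00'],
    'tier2': ['01:00'],
    'tier3': ['02:00'],
    'tier4': ['04:00'],
}

def apply_schedule(access_token, workspace_id, dataset_id, tier):
    url = (
        f'https://api.powerbi.com/v1.0/myorg/groups/{workspace_id}'
        f'/datasets/{dataset_id}/refreshSchedule'
    )
    payload = {
        'value': {
            'enabled': True,
            'days': ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday'],
            'times': TIER_WINDOWS[tier],
            'localTimeZoneId': 'UTC',
            'notifyOption': 'MailOnFailure',
        }
    }
    resp = requests.patch(
        url,
        headers={'Authorization': f'Bearer {access_token}'},
        json=payload,
    )
    resp.raise_for_status()

The tier each dataset lands in comes from the criticality, freshness, and size analysis above.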

Incremental Refresh Configuration

For large datasets (>10M rows), full refreshes are wasteful. Incremental refresh policies partition the dataset by date and only refresh the most recent partition:

Full Refresh:     50M rows, ~4 min per run   ≈ 200 CU-seconds
Incremental:       2M rows, ~0.3 min per run ≈   6 CU-seconds
Savings:          ~97% per refresh cycle

The configuration requires proper date partitioning in the data model and RangeStart/RangeEnd parameters in Power Query. We audit every dataset above 5M rows for incremental refresh eligibility.

Phase 3: Query Performance Optimization

Slow queries consume CU in proportion to their execution time: a report that takes 15 seconds to render consumes roughly 15 times the CU of one that renders in 1 second.

DAX Query Profiling

We use DMV queries to identify the most expensive DAX operations:

SELECT 
    [OBJECT_ID],
    [OBJECT_CPU_TIME_MS],
    [OBJECT_READS],
    [OBJECT_READ_KB],
    [OBJECT_USED_MEMORY]
FROM $SYSTEM.DISCOVER_OBJECT_ACTIVITY
ORDER BY [OBJECT_CPU_TIME_MS] DESC

The top 10 most expensive queries typically account for 40-60% of total query CU consumption. Optimizing these queries alone can reduce capacity requirements by 15-25%.

Common DAX Anti-Patterns

1. CALCULATE with unnecessary context transition

// Expensive: Forces full table scan
Bad Measure = 
CALCULATE(
    SUM(Sales[Amount]),
    ALL(Sales)
)

// Optimized: Uses SUMX only when row context is needed
Good Measure = 
SUMX(
    VALUES(Products[Category]),
    [Category Sales]
)

2. Iterator functions on large tables

// Expensive: SUMX iterates row by row over millions
Bad Revenue = 
SUMX(
    Sales,
    Sales[Quantity] * RELATED(Products[Price])
)

// Optimized: Pre-calculate in Power Query or use measure
Good Revenue = 
SUM(Sales[Revenue])  -- Column computed at refresh time

3. Excessive use of DISTINCTCOUNT

DISTINCTCOUNT is one of the most expensive DAX operations because it requires scanning and deduplicating the entire column. For high-cardinality columns (>1M distinct values), consider:

  • Pre-aggregating in Power Query
  • Using approximate distinct count if precision isn’t critical
  • Partitioning the calculation by date ranges

Phase 4: Capacity Right-Sizing

After optimizing refreshes and queries, the actual CU consumption profile becomes clear. We then right-size the capacity:

The Decision Matrix

Current Peak CU         Sustained Average    Recommendation
>90% of capacity        >60% average         Keep current tier
>90% of capacity        <40% average         Optimize scheduling, then reassess
<60% of capacity        <30% average         Downgrade one tier
<40% of capacity        <20% average         Downgrade two tiers
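
Expressed as code, the matrix is a simple threshold lookup that can run inside the weekly capacity report. A minimal sketch (thresholds mirror the table above; peak and average are fractions of total capacity):

# Sketch: map observed peak and sustained-average CU utilization
# (as fractions of capacity, e.g. 0.85 = 85%) to a sizing recommendation.
def sizing_recommendation(peak, average):
    if peak > 0.90 and average > 0.60:
        return 'Keep current tier'
    if peak > 0.90 and average < 0.40:
        return 'Optimize scheduling, then reassess'
    if peak < 0.40 and average < 0.20:
        return 'Downgrade two tiers'
    if peak < 0.60 and average < 0.30:
        return 'Downgrade one tier'
    return 'No change; continue monitoring'

print(sizing_recommendation(0.55, 0.25))  # -> 'Downgrade one tier'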

Auto-Scale Configuration

For organizations with predictable usage patterns (high during business hours, minimal overnight), Fabric supports capacity auto-scaling. We configure:

  • Base capacity: F16 (handles overnight refreshes and low-traffic periods)
  • Scale-up trigger: 75% sustained CU for 10 minutes
  • Scale-up target: F32 (handles business-hour query load)
  • Scale-down trigger: <30% sustained CU for 30 minutes
  • Scale-down target: Return to F16

This pattern reduces effective monthly cost by 30-40% compared to a static F32 allocation.
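
The scale step itself is just a resize of the capacity resource. A hedged sketch of the scale-up action, assuming the capacity lives under the Microsoft.Fabric resource provider and an ARM token is available; verify the resource path and api-version against current Azure documentation before relying on it:

# Sketch: resize a Fabric capacity between SKUs via Azure Resource Manager.
# Assumptions: Microsoft.Fabric/capacities resource path and the api-version
# shown here; confirm both against current Azure documentation before use.
import requests

def resize_capacity(arm_token, subscription_id, resource_group, capacity_name, sku):
    url = (
        f'https://management.azure.com/subscriptions/{subscription_id}'
        f'/resourceGroups/{resource_group}/providers/Microsoft.Fabric'
        f'/capacities/{capacity_name}?api-version=2023-11-01'
    )
    resp = requests.patch(
        url,
        headers={'Authorization': f'Bearer {arm_token}'},
        json={'sku': {'name': sku, 'tier': 'Fabric'}},
    )
    resp.raise_for_status()
    return resp.json()

# Example: scale up when the 75%-for-10-minutes trigger fires
# resize_capacity(token, sub_id, 'rg-bi-prod', 'fabric-prod', 'F32')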

Phase 5: Governance Automation

Optimization is not a one-time event. Without ongoing governance, organizations drift back to inefficiency within 3-6 months as new workspaces are created, new datasets are published, and new refresh schedules are configured.

We deploy automated governance through CI/CD pipelines that:

  1. Block oversized datasets from publishing to production (>500MB without incremental refresh configured)
  2. Alert on refresh failures that exceed 3 consecutive occurrences (see the sketch after this list)
  3. Report weekly on capacity utilization trends with anomaly detection
  4. Flag inactive workspaces quarterly for review
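
As a concrete example of rule 2, here is a minimal sketch that inspects a dataset’s recent refresh history; the threshold parameter mirrors the rule above, and wiring the result into an alerting channel is left to the pipeline:

# Sketch: flag datasets whose most recent finished refreshes all failed.
import requests

def consecutive_failures(access_token, workspace_id, dataset_id, threshold=3):
    url = (
        f'https://api.powerbi.com/v1.0/myorg/groups/{workspace_id}'
        f'/datasets/{dataset_id}/refreshes?$top=10'
    )
    resp = requests.get(url, headers={'Authorization': f'Bearer {access_token}'})
    resp.raise_for_status()

    # Sort newest-first ourselves rather than relying on response ordering,
    # and ignore refreshes that are still in progress ('Unknown').
    history = sorted(resp.json()['value'], key=lambda r: r['startTime'], reverse=True)
    finished = [r for r in history if r['status'] != 'Unknown']

    recent = finished[:threshold]
    return len(recent) == threshold and all(r['status'] == 'Failed' for r in recent)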

The ROI Framework

For a mid-sized organization running F64 capacity:

Optimization                          CU Reduction    Monthly Savings (USD)
Zombie workspace cleanup              5-10%           $400-$830
Refresh schedule staggering           15-25%          $1,250-$2,080
Incremental refresh adoption          10-20%          $830-$1,660
DAX query optimization                10-15%          $830-$1,250
Capacity right-sizing (F64→F32)       50%             $4,160
Total potential savings                               $7,470-$9,980/month

Annualized, that’s $89,640-$119,760 in reduced Fabric licensing costs. The audit and optimization engagement typically costs a fraction of the first year’s savings.

The Bottom Line

Fabric capacity optimization isn’t about cutting corners. It’s about ensuring every CU consumed delivers business value. The organizations that treat capacity management as an engineering discipline—not an afterthought—consistently spend 30-50% less while delivering faster, more reliable reporting to their stakeholders.