Managing large data volumes (LDV) in Salesforce is a challenge that affects performance, scalability, governor limits, data integrity, and user experience. Whether you’re dealing with millions of records or high-velocity transactions, following best practices ensures your org remains fast, stable, and compliant with Salesforce limits.
What Are Large Data Volumes (LDV) in Salesforce?
In Salesforce, Large Data Volumes (LDV) refers to scenarios where you are working with hundreds of thousands, millions, or even billions of records in one or more objects. When the data grows to this scale, standard operations—like queries, reports, list views, triggers, sharing recalculations, and data loads—begin to slow down or hit governor limits.
Salesforce does not define LDV by a fixed number; it depends on the org's structure and size. In real-world projects, the following are common LDV scenarios:
- An object with more than 5 million records
- Tens of thousands of users accessing the system simultaneously
- A lookup/parent record with more than 10,000 child or related records
- An object with a large number of fields or very large records, such as long text areas or files
- Org storage that has grown to 100 GB
Based on an object's record count, typical impacts on the system are:
| Record Count | Typical Impact |
|---|---|
| > 300,000 records | Queries start requiring indexes |
| > 1 million records | Full-table scans cause timeouts |
| > 5 million records | Triggers, sharing, reports slow down |
| > 10 million records | Archival strategy becomes essential |
| > 50 million+ records | Big Objects or external systems recommended |
In short, if a Salesforce org has more data than the platform can process efficiently with its default configuration, you are dealing with LDV.
Why LDV Matters
When objects reach LDV, it impacts:
- Performance degradation (slow queries, timeouts)
- Data load failures
- Sharing recalculation delays
- Reports taking too long
- Record locking issues
- Long-running Apex jobs
- Governor limit exceptions
So Salesforce recommends special design patterns and best practices to ensure the system stays scalable and responsive.
Below are the top 10 best practices, each explained with clarity and real-world examples.
1. Choose the Right Data Model (Avoid Too Many Lookups & Unnecessary Relationships)
Lookup fields and relationships are powerful, but when overused, especially at scale, they become a serious performance bottleneck. Each lookup adds an internal join, and Salesforce databases are optimized for CRM transactions, not heavy relational complexity.
When a record has many lookup fields, Salesforce must validate and fetch related data every time you:
- View a record
- Run a report
- Execute a SOQL query
- Load or update data
More lookups = more joins = slower performance.
How to handle the lookup issue:
- Use master-detail instead of lookup when you need roll-ups and tight coupling.
- Archive or delete unused fields and relationships.
- Avoid many-to-many relationships for high-volume objects.
Example:
A Service Cloud org had 20 lookup fields on Case. As data grew, record save time shot up to 3–5 seconds. Optimizing the data model reduced query time drastically.
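To see why each lookup matters at query time, here is a hedged illustration of how lookup fields surface as joins in SOQL; the custom relationship names (Primary_Asset__r, Service_Contract__r) are hypothetical, not from a specific org:

```apex
// Each __r relationship traversal below forces an internal join at the database level.
// Custom relationship names are illustrative only.
List<Case> cases = [
    SELECT Id, Subject,
           Account.Name,                    // join to Account
           Contact.Email,                   // join to Contact
           Primary_Asset__r.SerialNumber,   // join through a hypothetical custom lookup
           Service_Contract__r.EndDate      // another join through a hypothetical lookup
    FROM Case
    WHERE Status = 'Open'
    LIMIT 200
];
// With 20 such lookups on Case, every record view, report, and query pays the join cost,
// which is why trimming unused relationships noticeably improves save and query times.
```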
Refer to the post The Hidden Risks of Overusing Lookups in Salesforce for more information about this topic.
2. Index Your Fields & Use Selective SOQL Queries
As your Salesforce data grows into the millions, even simple queries can struggle. Reports start timing out, list views take forever to load, and integrations begin returning “query too expensive” errors. The root cause behind most of these performance issues is non-selective SOQL — queries that force Salesforce to scan the entire object instead of using indexes.
Indexes act like shortcuts that allow Salesforce to jump directly to the right records, instead of reading everything row by row. When your filters use indexed fields, the system becomes significantly faster, more stable, and more scalable, even with Large Data Volumes (LDV).
A non-selective query forces Salesforce to check every record in the object. With millions of rows, this means:
- Long response times
- Timeouts
- Heavy CPU usage
- Increased load on the database
This affects every user and background process, not just your query.
How to handle a non-selective query:
- Use indexed fields in your filters—Id, Name, OwnerId, audit fields (such as CreatedDate), and external ID fields.
- Raise a Salesforce case to add custom indexes where needed.
- Avoid operators that break selectivity, such as `NOT`, `!=`, and `LIKE` with a leading wildcard (e.g., `LIKE '%text'`), which behaves like a "contains" search (see the sketch below).
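As a minimal sketch, assuming External_Id__c is a custom field marked as an external ID (and therefore indexed), here is the difference between a non-selective and a selective filter:

```apex
// Non-selective: a leading-wildcard LIKE and a != filter cannot use an index,
// so Salesforce scans the whole Case table.
List<Case> slow = [
    SELECT Id FROM Case
    WHERE Subject LIKE '%refund%' AND Status != 'Closed'
];

// Selective: filters on indexed fields (OwnerId, CreatedDate, an external ID)
// let the query optimizer jump straight to the matching rows.
String externalKey = 'ORD-0001';   // hypothetical external key value
List<Case> fast = [
    SELECT Id FROM Case
    WHERE OwnerId = :UserInfo.getUserId()
      AND CreatedDate = LAST_N_DAYS:30
      AND External_Id__c = :externalKey   // assumed indexed external ID field
];
```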
Refer to the post Optimize SOQL Filter in Apex Code to handle this issue.
3. Use Skinny Tables for Performance Optimization
When objects grow into millions of records, retrieving even a few frequently used fields can become slow. Reports take longer to load, list views feel sluggish, and API queries struggle. Skinny Tables offer a high-performance shortcut by storing only selected fields in a slimmer, faster-access table behind the scenes.
Skinny Tables will solve issues like
- Slow queries due to too many fields on an object.
- Performance delays in list views and reports.
- Heavy record reads because Salesforce retrieves all fields—even when only a few are needed.
- Large objects causing table scans during peak load.
Skinny tables act like a performance booster when your object becomes too “heavy” to query efficiently.
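Skinny tables are provisioned by Salesforce Support (you raise a case; there is nothing to deploy yourself). The sketch below shows the kind of narrow, frequently run query that benefits; once the skinny table exists, the same SOQL is served from it transparently:

```apex
// A narrow, hot query over an LDV object — a typical skinny table candidate.
// After Support provisions a skinny table containing these fields,
// this exact query is served from the slimmer table with no code changes.
List<Case> recentCases = [
    SELECT Id, CaseNumber, Status, Priority, LastModifiedDate
    FROM Case
    WHERE Status = 'Open'
    ORDER BY LastModifiedDate DESC
    LIMIT 500
];
```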
Check out the post “What is a Skinny Table?” for more details about skinny tables.
4. Archive or Purge Old Data Regularly
Salesforce is not designed to store unlimited historical data like customer transaction histories. When old, irrelevant, or unused records pile up, they quietly drag down performance across the entire Salesforce Org. Archiving helps keep your active data lean, fast, and easier to query by offloading the noise into cheaper storage.
Archiving will solve
- Slow reports due to millions of old records.
- Non-selective SOQL caused by outdated data that still matches broad filters.
- Long-running sharing recalculations on large objects.
- Skyrocketing storage costs with no business value.
- Batch jobs taking hours because of unnecessary old data.
Archiving ensures your org stays fast, clean, and compliant with LDV best practices.
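As a minimal sketch, assuming closed Cases older than two years can be purged (in a real org you would export or copy them to external storage first), a purge batch might look like this:

```apex
// Minimal purge batch: deletes closed Cases older than 2 years in governor-safe chunks.
// Assumption: records have already been archived externally; adjust the filter to your retention policy.
public class OldCasePurgeBatch implements Database.Batchable<SObject> {

    public Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator(
            'SELECT Id FROM Case WHERE IsClosed = true AND ClosedDate < LAST_N_YEARS:2'
        );
    }

    public void execute(Database.BatchableContext bc, List<Case> scope) {
        delete scope;
        Database.emptyRecycleBin(scope); // hard delete so storage is actually reclaimed
    }

    public void finish(Database.BatchableContext bc) {
        // Optionally chain the next cleanup job or send a completion notification here.
    }
}
// Run with a modest scope size, e.g.: Database.executeBatch(new OldCasePurgeBatch(), 2000);
```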
Refer to the post The Ultimate Guide to Data Cleanup Techniques for Salesforce for more details about archiving or purging old data.
5. Use Async Processing (Batch, Queueable, Future, Platform Events)
Large data operations—like mass updates, recalculations, or bulk integration loads—cannot run in real-time without hitting limits. Asynchronous processing moves heavy work into the background so the system stays responsive while still delivering reliable throughput for big jobs.
Async processing will solve
- CPU limit errors during large updates.
- Too many DML rows in a single transaction.
- Timeouts in synchronous processes like flows or triggers.
- Integration failures when pushing large workloads.
- Sequential dependency problems in automations.
Async processing isn’t just a performance practice—it’s essential for LDV stability.
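As a hedged example, here is a Queueable that moves a heavy recalculation out of the synchronous transaction; the rollup field Open_Case_Count__c is hypothetical:

```apex
// Queueable sketch: recalculates a hypothetical rollup field asynchronously
// so the triggering transaction stays fast and within limits.
public class AccountRecalcQueueable implements Queueable {

    private Set<Id> accountIds;

    public AccountRecalcQueueable(Set<Id> accountIds) {
        this.accountIds = accountIds;
    }

    public void execute(QueueableContext context) {
        List<Account> toUpdate = new List<Account>();
        // The aggregate query runs in the async context, away from the user's transaction.
        for (AggregateResult ar : [
                SELECT AccountId, COUNT(Id) total
                FROM Case
                WHERE AccountId IN :accountIds AND IsClosed = false
                GROUP BY AccountId]) {
            toUpdate.add(new Account(
                Id = (Id) ar.get('AccountId'),
                Open_Case_Count__c = (Integer) ar.get('total')   // hypothetical custom field
            ));
        }
        update toUpdate;
    }
}
// Enqueue from a trigger handler or flow: System.enqueueJob(new AccountRecalcQueueable(accountIds));
```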
Refer to the post Revisit Asynchronous Apex: Type and Usage to learn about async jobs.
6. Optimize Triggers: One Trigger Per Object
When your org has millions of records, inefficiencies in triggers become amplified. Multiple triggers, unbulkified logic, SOQL inside loops, and scattered processing can cause major failures. A clean, unified trigger architecture ensures predictable, scalable behavior—even under heavy load.
An optimized trigger will solve
- SOQL/DML governor limit errors due to poor bulk handling.
- Complex, unmaintainable code scattered across multiple triggers.
- Conflicting logic leading to data inconsistency.
- High CPU usage in LDV environments.
- Slow data loads because triggers fire dozens of times unnecessarily.
One well-structured trigger per object is a foundation for LDV-safe automation.
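Here is a minimal sketch of the one-trigger-per-object pattern; the handler naming and structure below are one common convention, not a Salesforce requirement:

```apex
// Single trigger per object: no logic here, just delegation to a handler.
trigger CaseTrigger on Case (before insert, before update, after insert, after update) {
    CaseTriggerHandler.handle(Trigger.new, Trigger.oldMap, Trigger.operationType);
}
```

```apex
// Handler keeps all logic bulkified and testable in one place.
public class CaseTriggerHandler {

    public static void handle(List<Case> newCases, Map<Id, Case> oldMap,
                              System.TriggerOperation op) {
        switch on op {
            when BEFORE_INSERT { setDefaults(newCases); }
            when BEFORE_UPDATE { stampEscalation(newCases, oldMap); }
            when else { /* add other events only when needed */ }
        }
    }

    private static void setDefaults(List<Case> newCases) {
        for (Case c : newCases) {
            if (c.Priority == null) { c.Priority = 'Medium'; }
        }
    }

    private static void stampEscalation(List<Case> newCases, Map<Id, Case> oldMap) {
        // Bulkified: no SOQL or DML inside the loop.
        for (Case c : newCases) {
            if (c.IsEscalated && !oldMap.get(c.Id).IsEscalated) {
                c.Escalated_On__c = System.now();   // hypothetical custom field
            }
        }
    }
}
```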
Refer to the post Top Mistakes Developers Make in Salesforce Apex Triggers to learn trigger optimization.
7. Use Big Objects for Massive Storage
When your business needs to store hundreds of millions or billions of historical or event-based records, standard objects simply cannot scale. Big Objects provide massive, high-volume storage without affecting CRM performance, making them perfect for logging, auditing, or long-term history.
Big Objects will solve
- Standard object storage limits being reached too quickly.
- Slow queries on historical datasets.
- High storage costs for rarely accessed data.
- Performance degradation due to old records mixed with current data.
- Inability to store IoT/transaction logs in standard Salesforce objects.
Big Objects help you scale beyond CRM constraints.
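As a hedged sketch, writing to a hypothetical custom Big Object Case_History__b (Big Object API names end in __b) uses Database.insertImmediate rather than regular DML, and queries must filter on the object's index fields:

```apex
// Big Object rows are written with insertImmediate (regular insert DML is not supported).
// Case_History__b and its fields are a hypothetical Big Object defined for audit history;
// Case_Id__c (text) and Changed_On__c are assumed to be part of its index, in that order.
Case_History__b row = new Case_History__b();
row.Case_Id__c    = '500000000000001';   // placeholder key value
row.Changed_On__c = Datetime.now();
row.Field_Name__c = 'Status';
row.New_Value__c  = 'Closed';

Database.SaveResult result = Database.insertImmediate(row);

// Queries must filter on the index fields from left to right.
List<Case_History__b> history = [
    SELECT Case_Id__c, Changed_On__c, Field_Name__c, New_Value__c
    FROM Case_History__b
    WHERE Case_Id__c = '500000000000001'
];
```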
8. Leverage External Systems (Data Virtualization Over Synchronization)
Not all business data should be stored inside Salesforce. Storing everything—orders, logs, analytics data, inventory, events—causes unnecessary load. By keeping large or non-critical datasets in external systems and referencing them only when needed, you maintain speed while avoiding storage overload.
Data virtualization will solve
- Massive storage consumption from non-CRM data.
- Slow performance due to oversized objects.
- Integration failures caused by overloading Salesforce with data it shouldn’t store.
- Long migration times because of bloated tables.
- High API usage from syncing data unnecessarily.
Virtualize what you don’t need locally, and keep Salesforce focused on CRM.
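With Salesforce Connect, external data is surfaced as external objects (API names end in __x) and queried like any other object while the rows stay in the source system. A hedged sketch, assuming a hypothetical Order_Line__x external object:

```apex
// External object query: the SOQL looks normal, but each row is fetched on demand
// from the external system (e.g., via OData) instead of Salesforce storage.
// Order_Line__x and its custom fields are hypothetical and depend on your data source.
String orderNumber = 'ORD-1001';   // hypothetical order key
List<Order_Line__x> lines = [
    SELECT ExternalId, Product_Code__c, Quantity__c, Amount__c
    FROM Order_Line__x
    WHERE Order_Number__c = :orderNumber
    LIMIT 100
];
```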
Refer to the post Build Scalable Solutions with Salesforce for more information.
9. Optimize Reporting, Dashboards & List Views
Reports and list views are some of the biggest hidden performance killers in LDV environments. Broad filters, unbounded queries, and cross-object conditions can easily cause timeouts or slow user experience. Optimizing reporting ensures fast insights without dragging down the system.
Optimized reporting will solve
- Slow report execution on millions of rows.
- Dashboard refresh failures.
- List views taking 10–20 seconds to load.
- Non-selective filters causing full table scans.
- Heavy database load from inefficient report definitions.
Smart reporting design keeps analytics fast and reliable even at scale.
Refer to the post Optimize Salesforce Reports and Dashboard for more information about reporting optimization.
10. Use Data Skew Avoidance Techniques
Large Data Volumes often lead to scenarios where too many records depend on a single parent, owner, or lookup value. This creates data skew, which triggers locking issues, slow saves, and even job failures. Distributing data ownership and relationships helps maintain performance and concurrency.
Avoiding data skew will solve
- Record locking during updates.
- Slow save operations when the same parent record is overused.
- Batch jobs failing with “UNABLE_TO_LOCK_ROW”.
- Sharing recalculation delays for skewed users or accounts.
- System contention when thousands of records hit the same parent.
Skew prevention is one of the biggest keys to stability in LDV orgs.
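A common mitigation is to spread child records across a small pool of parent records instead of piling them onto one. Here is a minimal sketch, assuming integration-created Cases are distributed round-robin across several hypothetical "bucket" Accounts:

```apex
// Round-robin assignment sketch to avoid account data skew: instead of one
// catch-all parent, child Cases are spread over a small pool of "bucket" Accounts.
// The 'Integration Bucket%' naming convention and overall approach are illustrative.
public class CaseBucketAssigner {
    public static void assignBuckets(List<Case> incomingCases) {
        List<Account> buckets = [
            SELECT Id FROM Account WHERE Name LIKE 'Integration Bucket%' ORDER BY Name
        ];
        if (buckets.isEmpty()) { return; } // nothing to distribute across

        Integer i = 0;
        for (Case c : incomingCases) {
            c.AccountId = buckets[Math.mod(i, buckets.size())].Id;
            i++;
        }
        // Keeping each parent well under ~10,000 children reduces lock contention
        // and UNABLE_TO_LOCK_ROW errors during parallel loads.
    }
}
```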
Refer to the post What is data skew for more information.
Summary
Salesforce is highly scalable, but LDV needs a careful combination of design, indexing, async processing, and archival strategy.
Using these top 10 practices ensures that your org:
✔ Remains performant
✔ Scales with business growth
✔ Avoids governor limits
✔ Supports better reporting
✔ Improves user experience
Related Posts
- Best Practices to Avoid Hardcoding in Apex for Cleaner Salesforce Code
- Best Code Analysis Tools For Salesforce Development
- Top 10 Best Practice for Lightning Flow
- Top Salesforce Integration Challenges and How to Solve Them
- The Hidden Risks of Overusing Lookups in Salesforce
- The Ultimate Guide to Data Cleanup Techniques for Salesforce
- Top Mistakes Developers Make in Salesforce Apex Triggers
- The Ultimate Guide to Apex Order of Execution for Developers
- How to Handle Bulkification in Apex with Real-World Use Cases
- How to Confidently Manage Transactions in Salesforce Apex
- How to Suppress PMD Warnings in Salesforce Apex
