Back to Blog
Privacy & Security

Excel Pivot Table Caches: The Hidden Data You Forgot to Remove

You delete the source data sheet, save the file, and ship the dashboard. The recipient opens the workbook, double-clicks the pivot table grand total, and a brand-new sheet appears containing every row you thought you removed — salaries, costs, customer names, the lot. Welcome to the pivot cache: a full, embedded snapshot of source data that lives inside the pivot table itself, completely independent of the source range. It is one of the most common and damaging metadata leaks in Excel, and almost no one knows it is there.

Privacy & Security Team
April 17, 2026
19 min read

What the Pivot Cache Actually Is

When you create a pivot table, Excel does not simply read your source range each time you change a filter or drag a field. Instead, it builds an internal data structure called the pivot cache: a compressed, deduplicated, columnar copy of the source data, stored inside the workbook file. Every interaction with the pivot table reads from this cache, not from the source range.

This design is a performance win — pivot tables stay fast even on millions of rows — but it has a critical side effect: the source data now exists in two places. If you delete the source sheet, hide it, or even point the pivot at an external file that recipients cannot access, the cache still contains the data. It travels with the workbook, and it does not require the source to function.

The Core Misconception

Most users believe that “deleting the source sheet” removes the data from the workbook. It does not. The pivot cache is a completely separate copy of the source rows, stored inside an XML part called pivotCacheRecords. Deleting the source sheet only removes the visible source — the cache, with every original row intact, remains embedded in the file you ship.

Where the Cache Lives Inside the XLSX File

Every XLSX file is a ZIP archive containing XML parts. When a workbook contains a pivot table, two new directories appear inside the archive:

  • xl/pivotCache/pivotCacheDefinition1.xml — defines the cache schema: field names, data types, the source reference, and the unique values seen in each column.
  • xl/pivotCache/pivotCacheRecords1.xml — contains every individual record from the source data, row by row, encoded as references into the cache definition’s shared item lists.
  • xl/pivotTables/pivotTable1.xml — describes the visible layout (rows, columns, values, filters). This part references the cache but does not duplicate the data.

The records file is the leak. Open any XLSX with a pivot table by renaming it to .zip and unzipping it, then look at pivotCacheRecords1.xml. You will see one <r> element per source row, with each cell value stored either inline or as an index into the shared item list.

// Excerpt from pivotCacheRecords1.xml

<pivotCacheRecords count="5247">

<r>

<s v="Acme Corp"/> <!-- customer name -->

<s v="Sarah Johnson"/> <!-- account rep -->

<n v="187500"/> <!-- contract value -->

<n v="0.42"/> <!-- discount applied -->

</r>

</pivotCacheRecords>

Note the count="5247" attribute on the root element. That is 5,247 source rows, all preserved verbatim, in a workbook whose visible pivot table summarizes them down to a 12-row by 4-column grid. The summary is what the recipient sees first; the raw rows are what they get if they know to look.

The “Show Details” Trapdoor

The recipient does not need to know XML or rename files. Excel itself ships a built-in feature that reads the cache and dumps it onto a new sheet: double-clicking any value cell in a pivot table. This invokes the “Show Details” (also called “Drill Down”) action, which generates a fresh worksheet containing every source row that contributed to the value.

Double-Click is the Attack

No special tools, no XML parsing, no expertise required. Anyone who opens the file in Excel can double-click a single number in the pivot table and instantly see all underlying source rows for that intersection. Double-click the grand total cell and Excel will materialize the entire cache onto a new sheet. This is a feature, not a bug — and almost no recipient knows you did not intend it.

Show Details is enabled by default on every pivot table. It can be disabled per-pivot through PivotTable Options → Data → uncheck “Enable show details,” but this only blocks the double-click shortcut. The cache itself remains in the file, so anyone determined to extract the data can still do so via XML inspection, third-party tools, or by re-enabling the option (it is a per-pivot setting, not a permission).

Why Deleting the Source Sheet Does Not Help

A common cleanup workflow looks like this: build the pivot table, copy it to a new sheet, delete the source data sheet, save, share. Surely the data is gone now? It is not. Here is what actually happens at the XML level:

ActionWhat ChangesWhat Stays
Delete source sheetThe visible worksheet XML is removedpivotCacheRecords1.xml — intact
Hide source sheetSheet’s state attribute set to “hidden”Source sheet AND cache — both intact
Very-hide source sheetSheet hidden via VBA, not visible in Excel UISource sheet AND cache — both intact
Convert pivot to valuesPivot becomes a static rangeCache may persist if pivot definition remains
Save as XLSXNo structural cleanup occursEverything above — intact

The cache is bound to the pivot table, not to the source sheet. As long as the pivot table exists, the cache exists, and as long as the cache exists, the source data is shippable. Worse, hiding the source sheet creates a false sense of security — recipients can right-click any sheet tab and select “Unhide” to reveal it. “Very-hidden” sheets created via VBA are slightly harder to surface but trivial for anyone with basic Excel knowledge to expose.

External Sources: The Cache Still Travels With You

Pivot tables can be built on external data: SQL databases, OLAP cubes, Power Query connections, or other Excel files. Many users assume that this protects them — if the data lives in a database the recipient cannot reach, surely the workbook cannot leak it. This assumption is wrong.

When you save the workbook, Excel writes the cache to disk by default, even for external sources. The recipient does not need credentials to the source database; they have a snapshot of the query result, frozen at the moment you last refreshed, sitting inside pivotCacheRecords1.xml. A pivot built against your sensitive HR database will gladly leak salaries to an external recipient who has no database access whatsoever.

“Save Source Data With File”

PivotTable Options → Data tab includes a checkbox titled “Save source data with file.” It is checked by default. Unchecking it tells Excel to discard the cache before saving and to refresh on next open. This is the single most important setting for protecting external-source pivot tables — but it must be set before the file is shared, and the option to “Refresh data when opening the file” should also be enabled so the pivot still works.

How This Goes Wrong in Practice

The pivot cache leak is not a theoretical risk — it is one of the most frequent vectors for accidental data exposure. A few representative scenarios:

Scenario 1: The Salary Bands Report

HR builds a report showing average compensation by job level for the leadership team. The source sheet contains every employee’s individual salary, name, and manager. HR “cleans up” the file by deleting the source sheet, leaving only the pivot summary. The file is shared with all department heads.

Risk: Any department head double-clicks the “Average Salary” cell for their level and instantly sees every employee’s name, salary, and manager assignment. The cache also lets them double-click other levels they should not see.

Scenario 2: The Investor Deck Backup

Finance prepares an Excel file showing high-level revenue by product line for an investor update. The source data — raw transactional records with customer names, contract terms, and individual deal sizes — is in a hidden sheet. The file is sent to twenty external investors.

Risk: A single investor right-clicks any sheet tab, selects “Unhide,” and sees the entire transactional ledger. Even if the sheet is not unhidden, the pivot cache contains the same rows and double-clicking any pivot value reveals them.

Scenario 3: The Vendor Bid Comparison

Procurement summarizes bids from five vendors in a pivot table for the executive team. The source sheet lists each vendor’s pricing, including the line items each one asked Procurement to keep confidential. The summarized file is forwarded to one of the bidding vendors as part of a clarification request.

Risk: The receiving vendor double-clicks a competitor’s row total and obtains the full line-item pricing of every other bidder — competitive intelligence that destroys the integrity of the procurement process.

Scenario 4: The Patient Outcomes Dashboard

A clinical research team produces a pivot-based dashboard showing outcomes by treatment group, intended for sharing with academic peers. The source sheet contains patient identifiers, dates of treatment, and individual diagnoses. The team thinks they have aggregated the data; in reality, the cache holds every patient row.

Risk: The shared file is a HIPAA breach. PHI is fully present in the cache, accessible via Show Details or XML inspection, and the breach is reportable regardless of whether anyone actually accessed the data.

How to Verify Whether a File Has a Cache Leak

You can confirm in under a minute whether any XLSX file you are about to share contains a populated pivot cache. There are three reliable methods:

Method 1: The Double-Click Test

Open the file in Excel. Find the pivot table’s grand total cell — the bottom-right number. Double-click it. If a new worksheet appears with rows of data, the cache contains those rows and they will travel with the file. If you see an error like “The PivotTable report was saved without the underlying data,” the cache has been suppressed and the file is safer (assuming the option remains set).

Method 2: The XML Inspection

Make a copy of the XLSX, rename the copy from .xlsx to .zip, and open it in any archive viewer. Navigate to xl/pivotCache/. If you see pivotCacheRecords1.xml with a non-trivial file size (typically more than a few hundred bytes), the source data is embedded. Open the file in a text editor to see exactly what is in there.

# Quick command-line check (Mac/Linux/WSL)

unzip -l report.xlsx | grep pivotCacheRecords

# If the output shows a file with size > 1KB, the cache is populated

# Inspect the contents:

unzip -p report.xlsx xl/pivotCache/pivotCacheRecords1.xml | head

Method 3: A Metadata Analysis Tool

Tools like MetaData Analyzer parse XLSX files structurally and report on pivot cache presence, record counts, and the field schema embedded in the cache definition. This is the fastest way to audit large numbers of files and the only practical approach for organizations that need to enforce pre-share policies at scale.

How to Actually Remove a Pivot Cache

Cleaning a pivot cache properly requires deliberate action — saving the file, copying values, or deleting source sheets are all insufficient. Use one or more of the following techniques:

Option A: Disable “Save Source Data With File”

  1. Click anywhere inside the pivot table.
  2. Go to PivotTable AnalyzeOptions (or right-click the pivot → PivotTable Options).
  3. Switch to the Data tab.
  4. Uncheck “Save source data with file.”
  5. Check “Refresh data when opening the file” if the source is still accessible to recipients.
  6. Click OK and save the file.

Excel will discard the cache contents on save. Inspecting the resulting XLSX shows pivotCacheRecords1.xml shrunk to almost nothing. Important caveat: this must be repeated for every pivot table in the workbook — the setting is per-pivot, not workbook-wide.

Option B: Convert Pivots to Static Values

  1. Select the entire pivot table.
  2. Copy it (Ctrl+C / Cmd+C).
  3. Paste Special → Values over the same range, or onto a new sheet.
  4. Delete the original pivot table.
  5. Save the file.

When you delete the last pivot table that references a given cache, Excel removes the cache. Verify by inspecting the XLSX afterwards — the xl/pivotCache/ directory should be gone or empty.

Option C: Document Inspector

  1. Save your file (Document Inspector is destructive).
  2. File → Info → Check for IssuesInspect Document.
  3. Ensure “PivotTables, PivotCharts, Cube Formulas, Slicers, and Timelines” is checked.
  4. Click Inspect, then Remove All in the section that lists pivot artifacts.

Document Inspector removes the pivot tables, slicers, timelines, and their associated caches in one step. The downside is that it removes the pivots entirely, leaving recipients with no interactive summary. Combine with Option B if you need the visible pivot output preserved as static values.

Option D: Programmatic Removal

For automated pipelines, a small Python script using openpyxl can iterate every pivot in a workbook and toggle cache.saveData = False, then save. For maximum certainty, post-process the resulting XLSX with a ZIP library to delete every file under xl/pivotCache/ and rewrite the relationship parts that referenced them. This is the most robust approach for compliance-sensitive workflows where every file must be audited before egress.

# Strip pivot caches from every XLSX in a directory

import zipfile, shutil, os

def strip_pivot_caches(path):

tmp = path + ".tmp"

with zipfile.ZipFile(path) as zin, zipfile.ZipFile(tmp, "w") as zout:

for item in zin.infolist():

if "pivotCacheRecords" in item.filename:

continue # drop the cache records entirely

zout.writestr(item, zin.read(item.filename))

shutil.move(tmp, path)

Caveat: Deleting Cache Files Breaks the Pivot

Brute-force deletion of pivotCacheRecords*.xml files removes the data but leaves the pivot table without records to display. Excel will show empty pivots until the user refreshes, which requires the source to be accessible. For workflows where the file must remain self-contained and functional, prefer Option B (convert to static values) so recipients see the same numbers without any interactive cache.

Beyond the Records: Other Pivot Metadata to Watch

Even after stripping pivotCacheRecords1.xml, the cache definition file itself contains additional metadata that is worth understanding:

  • Field schema — Every column name from the source is preserved in the cache definition, even if you suppress the records. A column named “Sensitive_Salary_Adjustment” tells the recipient something even if no values are present.
  • Shared item lists — The cache definition stores the unique values that appear in each column. For low-cardinality fields (department, region, customer name) the entire universe of values is preserved. Suppressing records does not always remove these.
  • Connection strings — For pivots built on external sources, the cache definition stores the connection details: server names, database names, sometimes embedded credentials. Inspect xl/connections.xml as part of any cleanup.
  • Refresh metadatarefreshedBy, refreshedDate, and createdVersion attributes leak the username, exact timestamp, and Excel build of the last refresh. This is identity metadata that survives most cleanup workflows.
  • Calculated fields and items — Custom formulas defined inside the pivot are stored in the cache definition. These can reveal proprietary calculations, allocation logic, or pricing rules that you may not want disclosed.

Pre-Share Checklist for Workbooks Containing Pivot Tables

Run through this list before sending any Excel file that contains, or has ever contained, a pivot table:

  • Have I identified every pivot table in the workbook, including pivots on hidden or very-hidden sheets?
  • For each pivot, have I either disabled “Save source data with file” or converted the pivot to static values?
  • Have I run the double-click test on the grand total of every remaining pivot to confirm no source data appears?
  • Have I inspected the XLSX as a ZIP and confirmed that xl/pivotCache/pivotCacheRecords*.xml files are absent or trivially small?
  • Have I checked xl/connections.xml for embedded server names, database paths, or credentials in connection strings?
  • Have I cleared refreshedBy and other identity metadata from any remaining cache definition files?
  • Have I unhidden every sheet in the workbook to confirm there are no source sheets I forgot about?
  • Have I run Document Inspector and acted on every pivot-related finding it reports?
  • Have I verified the final file with a metadata analysis tool to catch anything I missed?

Organizational Recommendations

Individual diligence is necessary but not sufficient. Organizations that handle sensitive data in spreadsheets should institutionalize cache hygiene through several mechanisms:

Default-Off Cache Saving

Provide team templates with “Save source data with file” pre-disabled and “Refresh on open” pre-enabled. Most users never change these defaults; setting the correct ones at the template level eliminates the issue at scale.

Egress Scanning

Integrate pivot cache detection into DLP and email-attachment scanning pipelines. Flag any outbound XLSX whose pivotCacheRecords exceed a threshold size, and require manual review before delivery.

Sensitivity-Class Policies

For files classified as Confidential or higher, mandate pivot conversion to values before any external sharing. Use Microsoft Information Protection labels to enforce conditional cleanup workflows automatically.

Training With the Demonstration

Most users do not change behavior until they see the leak themselves. Build a five-minute training that walks them through the double-click test on a representative file. The visceral “wait, where did that come from?” moment is the most effective teaching tool available.

Conclusion

Pivot tables are one of Excel’s most useful features, and the cache that powers them is what makes them fast. But the cache is also a full, faithful copy of source data, persisted inside the workbook in a format that any recipient can extract with a double-click. Deleting source sheets, hiding sheets, and copying values do not remove this cache — only deliberate steps targeting the cache itself will.

The fix is straightforward once you know to look for it: disable “Save source data with file” on every pivot before sharing, or convert pivots to static values when the recipient does not need interactivity. Verify with the double-click test, with XML inspection, or with an automated tool. Treat any XLSX containing a pivot as a workbook that contains its source data, until you have personally confirmed otherwise.

The pattern is consistent across most Excel metadata risks: features designed for productivity create persistence layers that survive naive cleanup. The pivot cache is an especially severe instance because the data leak is total — not a stray name or timestamp, but every row of the original source. Treat it accordingly.

Detect Hidden Pivot Cache Data in Your Excel Files

Use MetaData Analyzer to inspect your XLSX files for pivot cache records, embedded source data, connection strings, and other hidden metadata before sharing. See exactly what your spreadsheets are leaking.