
Excel Threaded Comments and Persona IDs: How Modern Co-Authoring Comments Expose Microsoft 365 Tenant Identities

Modern Excel ships two completely separate comment systems inside the same workbook. The legacy notes that yellow-stickie veterans remember live in xl/comments1.xml. The newer threaded comments — the ones that look like Word review bubbles, support replies, and let you @-mention colleagues — live in a parallel folder, xl/threadedComments/, alongside a separate directory of personas in xl/persons/person.xml. The persona file is the part nobody looks at. Inside it, each commenter is identified by a long persona ID that resolves directly to a Microsoft 365 tenant GUID, an Active Directory object ID, or an email address — even when the visible display name has been edited or the comment thread itself has been resolved and hidden. This post walks through the structure of threaded comments, what each field reveals about your organisation and your collaborators, why the Document Inspector’s comment toggle does not fully erase the persona layer, and how to strip every trace before sharing workbooks externally.

Technical Team
May 7, 2026
21 min read

Two Comment Systems Living in the Same File

Excel’s comment story changed in 2018. The legacy “notes” (the yellow stickies attached to a single cell, addressed to nobody, never threaded) still ship for backwards compatibility, but the default in Microsoft 365 is now a thread-aware comment system designed for real-time co-authoring. The two systems coexist in the same workbook, are stored in different folders, and have completely different metadata implications.

| Feature | Legacy notes | Threaded comments |
|---|---|---|
| XML location | xl/comments1.xml | xl/threadedComments/threadedComment1.xml |
| Author identity | Free-text string in authors/author | Persona ID resolved through xl/persons/person.xml |
| Threading | No | Parent / reply tree by GUID |
| Mentions (@) | No | Yes, references additional persona IDs |
| Resolved state | No | Yes, done="1" attribute on the root comment |
| Hidden from default UI when resolved | N/A | Yes, user must explicitly show resolved threads |
| Document Inspector reach | Removed by “Comments and annotations” | Removed unevenly; person.xml frequently survives |

The mismatch matters because authors who use legacy notes assume their commenter identity is a single editable string, while threaded comments embed a structurally richer identity that points outwards into Microsoft 365 and Active Directory. Strip the visible display name and you have not removed the link.

A Third Layer Older Workbooks Carry

Workbooks created before threaded comments shipped occasionally carry a transitional layer in xl/commentsExt.xml that pre-dates the formal threadedComments folder. It is structurally similar but uses different namespace URIs. Treat any commentsExt file as part of the same audit surface.

Where Threaded Comments Live in the XLSX

An XLSX is a ZIP archive. Threaded comments and personas live in two parallel folders, both wired to worksheets and to each other through OPC relationships.

// XLSX layout fragment showing threaded comment storage
workbook.xlsx (zip)
├── [Content_Types].xml            // declares threadedComment + person content types
├── xl/
│   ├── workbook.xml
│   ├── persons/
│   │   └── person.xml             // <-- persona directory
│   ├── threadedComments/
│   │   ├── threadedComment1.xml   // <-- per-sheet threads
│   │   └── threadedComment2.xml
│   ├── comments1.xml              // legacy notes (often a stub)
│   └── worksheets/
│       ├── sheet1.xml
│       └── _rels/
│           └── sheet1.xml.rels    // links sheet1 to threadedComment1.xml

The wiring is deliberately layered. Each worksheet’s _rels/sheetN.xml.rels points to a threadedComment part. That part references persona IDs, which in turn resolve through xl/persons/person.xml. Deleting the threaded comment file alone leaves orphaned personas behind. Deleting the persons file alone leaves the threaded comments referring to phantom IDs, which Excel silently renders as “Unknown User” even though the original persona GUIDs are still sitting in the comment XML.
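The first hop of that chain can be followed with a few lines of Python. This is a minimal sketch, not a definitive implementation: the function name threaded_comment_targets is hypothetical, and it assumes the relationship Type URI ends in "/threadedComment", which is worth checking against your own files.

```python
import posixpath
import zipfile
from xml.etree import ElementTree as ET

REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"

def threaded_comment_targets(xlsx):
    """Map each worksheet part to the threaded-comment parts its .rels file names."""
    links = {}
    with zipfile.ZipFile(xlsx) as z:
        for name in z.namelist():
            if not (name.startswith("xl/worksheets/_rels/") and name.endswith(".rels")):
                continue
            # "sheet1.xml.rels" describes the relationships of "sheet1.xml".
            sheet = "xl/worksheets/" + posixpath.basename(name)[:-len(".rels")]
            root = ET.fromstring(z.read(name))
            for rel in root.iter(REL_NS + "Relationship"):
                # Assumption: threaded-comment relationships have a Type ending this way.
                if not rel.get("Type", "").endswith("/threadedComment"):
                    continue
                # Targets are usually relative to the worksheet folder.
                joined = posixpath.join("xl/worksheets", rel.get("Target", ""))
                target = posixpath.normpath(joined).lstrip("/")
                links.setdefault(sheet, []).append(target)
    return links
```

Calling threaded_comment_targets("workbook.xlsx") returns a dictionary keyed by worksheet part, which makes it easy to see which sheets carry threads before deciding what to strip.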

The Persona Directory: What Is in person.xml

xl/persons/person.xml is the single most identity-rich file most XLSX workbooks contain. Each <person> element carries a display name, a persona provider, a stable persona ID, and frequently an email-shaped userId that Excel uses to look the user up in Microsoft Graph.

// xl/persons/person.xml
<personList xmlns="http://schemas.microsoft.com/office/spreadsheetml/2018/threadedcomments">
  <person
    displayName="Alice Chen"
    id="{5f3c8b81-3f17-4a8a-91b8-4c3b71a1e3d9}"
    userId="alice.chen@contoso.com"
    providerId="AD"/>
  <person
    displayName="Bob Marlow"
    id="{9d8a4f1a-2c5e-4e63-b12c-7f0b2c8c8a13}"
    userId="S-1-5-21-3623811015-3361044348-30300820-2113"
    providerId="AD"/>
</personList>

| Attribute | What it carries | What it reveals |
|---|---|---|
| displayName | User-facing label | Full name as it appears in the directory or local profile. |
| id | Per-document persona GUID | A stable identifier used to wire comments to authors. Cross-document correlation by GUID frequently works for the same user across workbooks created in the same session. |
| userId | Provider-specific user identifier | UPN/email for AAD users, NT-style SID for AD users, an SMTP address for ad-hoc people-picker entries, or an objectId GUID for Graph users. The most directly identifying field in the entire workbook. |
| providerId | Identity provider | Common values: AD (Azure AD/AD), PeoplePicker, None. Lets a reader infer whether the author signed in to Microsoft 365 or simply typed a name into a local Office install. |

userId Is Often a Microsoft 365 Tenant Beacon

When the provider is AD and the user signed in to Microsoft 365, userId is the user’s UPN: alice.chen@contoso.com. The domain part is the tenant’s primary domain. A workbook stripped of every creator and lastModifiedBy tag still names the tenant and the individual user the moment one threaded comment exists. For on-premises AD setups, the same field carries the user’s SID, which uniquely identifies the user against the corporate domain controller.
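To triage a persona list quickly, the userId shapes described above can be sorted into rough buckets. The sketch below is illustrative only: classify_user_id is a hypothetical helper, and the regular expressions are assumptions about what UPN, SID, and GUID values look like, not an official grammar.

```python
import re

# Assumed shapes for the identifier kinds discussed in this section.
SID_RE = re.compile(r"^S-1-\d+(-\d+)+$")
GUID_RE = re.compile(r"^\{?[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\}?$")
DOMAIN_RE = re.compile(r"@([A-Za-z0-9.-]+)")

def classify_user_id(user_id):
    """Return (kind, detail) for a person.xml userId value."""
    value = (user_id or "").strip()
    if SID_RE.match(value):
        return ("sid", value)
    if GUID_RE.match(value):
        return ("guid", value.strip("{}").lower())
    match = DOMAIN_RE.search(value)
    if match:
        # For email-shaped values the detail is the domain, i.e. the tenant hint.
        return ("upn", match.group(1).lower())
    return ("other", value)
```

Running every userId through this helper and collecting the "upn" details gives the set of domains a recipient could read straight out of the file.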

The threadedComment Element: Threading, Mentions, and Timestamps

Each xl/threadedComments/threadedCommentN.xml stores all the threads attached to one worksheet. A thread is a chain of <threadedComment> elements bound by a parent ID. The root comment has no parent; each reply names the thread root in its parentId attribute.

// xl/threadedComments/threadedComment1.xml
<ThreadedComments xmlns="http://schemas.microsoft.com/office/spreadsheetml/2018/threadedcomments">
  <threadedComment
    ref="C7"
    dT="2026-04-19T14:32:11.45"
    personId="{5f3c8b81-3f17-4a8a-91b8-4c3b71a1e3d9}"
    id="{aa11bb22-cc33-44dd-ee55-ff6677889900}"
    done="1">
    <text>Are we still using the Q3 forecast assumptions here?</text>
  </threadedComment>
  <threadedComment
    ref="C7"
    dT="2026-04-19T14:41:02.10"
    personId="{9d8a4f1a-2c5e-4e63-b12c-7f0b2c8c8a13}"
    parentId="{aa11bb22-cc33-44dd-ee55-ff6677889900}"
    id="{bb22cc33-dd44-55ee-ff66-001122334455}">
    <text>@Carlos Iglesias confirmed the cost basis was updated yesterday.</text>
    <mentions>
      <mention mentionpersonId="{cc33dd44-ee55-66ff-0011-223344556677}"
               mentionId="{dd44ee55-ff66-7700-1122-334455667788}" startIndex="0" length="16"/>
    </mentions>
  </threadedComment>
</ThreadedComments>

The fields are individually small, but together they reconstruct a surprisingly complete audit trail of the conversation that produced the file.

| Attribute | Meaning |
|---|---|
| ref | A1-style cell reference the thread is attached to. Reveals which cells attracted discussion and, by extension, which numbers were uncertain. |
| dT | UTC timestamp to hundredths of a second. Far more precise than the workbook’s coarse core.xml timestamps, and not a value the Document Inspector touches. |
| personId | Foreign key into person.xml. The actual identity is one indirection away. |
| id | Per-comment GUID. Stable across saves, used to anchor replies. |
| parentId | GUID of the root comment in the thread. Lets you reconstruct the entire reply tree. |
| done | Set to 1 when the user marks the thread resolved. The thread disappears from the default UI but the XML is preserved. |
| mentions/mention | Each @-mention names the persona it points at, plus a span (startIndex, length) inside the comment text. |
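The parentId linkage is enough to rebuild each conversation. The following is a rough sketch under the assumption that every reply names the thread root in parentId, as described above; build_threads is a hypothetical name, and replies are ordered by comparing the dT strings, which only works while the timestamps share one format.

```python
from collections import defaultdict
from xml.etree import ElementTree as ET

NS = "{http://schemas.microsoft.com/office/spreadsheetml/2018/threadedcomments}"

def build_threads(xml_bytes):
    """Return a list of threads; each thread is [root, reply, reply, ...]."""
    root = ET.fromstring(xml_bytes)
    roots, replies = {}, defaultdict(list)
    for c in root.iter(NS + "threadedComment"):
        entry = {
            "id": c.get("id"),
            "ref": c.get("ref"),
            "dT": c.get("dT"),
            "personId": c.get("personId"),
            "done": c.get("done") == "1",
            "text": (c.findtext(NS + "text") or "").strip(),
        }
        parent = c.get("parentId")
        if parent is None:
            roots[entry["id"]] = entry       # no parentId: this is a thread root
        else:
            replies[parent].append(entry)    # reply, grouped under its root's GUID
    threads = []
    for root_id, first in roots.items():
        ordered = sorted(replies.get(root_id, []), key=lambda e: e["dT"] or "")
        threads.append([first] + ordered)
    return threads
```

Feeding it the bytes of one threadedCommentN.xml part yields the conversations in reading order, with the resolved flag carried on the first entry of each thread.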

Mentions Build a Hidden Org Chart

The <mentions> child of every threaded comment is the part most authors do not realise persists. When a user types @Carlos into a comment, Excel records:

  • The mentioned person’s persona ID, stored in the mention’s mentionpersonId attribute, even if Carlos has never opened the workbook himself.
  • A new persona record for Carlos in person.xml with his display name, his userId, and provider, sourced from the typing user’s directory lookup.
  • A separate mentionId GUID for the specific occurrence of the mention in the text, allowing Excel to drive notification dispatch and badge UI.
  • The exact character span of the mention in the comment text.

Multiplied across a workbook that has been collaborated on for weeks, the persona list ends up reading like a roll call of the project team — including people who only ever appeared as @-mentions and never typed a word in the file. A reader of the XLSX learns who reviewed the file, who was tagged for follow-up, and which questions were directed at whom. Even after the visible thread is resolved or deleted, the persona records frequently survive.
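The mention records can be pulled out alongside the text they cover. This is a sketch under stated assumptions: mention_spans is a hypothetical helper, it expects the mentions element to sit next to the text element inside each threadedComment, and it slices with Python string indices, which may drift from Excel's offsets if the text contains characters outside the Basic Multilingual Plane.

```python
from xml.etree import ElementTree as ET

NS = "{http://schemas.microsoft.com/office/spreadsheetml/2018/threadedcomments}"

def mention_spans(xml_bytes):
    """Return (comment id, mentioned persona id, mentioned text) for every @-mention."""
    out = []
    root = ET.fromstring(xml_bytes)
    for c in root.iter(NS + "threadedComment"):
        text = c.findtext(NS + "text") or ""
        for m in c.iter(NS + "mention"):
            start = int(m.get("startIndex", "0"))
            length = int(m.get("length", "0"))
            # Slice the span out of the comment text to show what was typed.
            out.append((c.get("id"), m.get("mentionpersonId"), text[start:start + length]))
    return out
```

Joining the second element of each tuple against person.xml turns the list into a table of who was tagged, in which comment, under which name.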

@-Mentions Persist After Comment Deletion

Deleting a single threaded comment that contained an @-mention removes the comment from the worksheet, but Excel does not garbage-collect the corresponding persona from person.xml. The mentioned colleague’s display name, UPN, and provider continue to ship inside the file. Audit a recently-edited workbook and you will frequently find personas with no remaining comment references — ghosts of conversations that were nominally erased.
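Those ghost personas are easy to list: collect every persona ID that is still referenced by a comment or a mention, and report the rest. The sketch below assumes the part names used throughout this post; orphan_personas is a hypothetical function name, and IDs are compared case-insensitively as a precaution.

```python
import zipfile
from xml.etree import ElementTree as ET

NS = "{http://schemas.microsoft.com/office/spreadsheetml/2018/threadedcomments}"

def orphan_personas(xlsx):
    """Return (id, displayName) for personas no comment or mention refers to."""
    with zipfile.ZipFile(xlsx) as z:
        names = z.namelist()
        if "xl/persons/person.xml" not in names:
            return []
        persons = ET.fromstring(z.read("xl/persons/person.xml"))
        people = [(p.get("id") or "", p.get("displayName") or "")
                  for p in persons.iter(NS + "person")]
        used = set()
        for name in names:
            if not name.startswith("xl/threadedComments/"):
                continue
            root = ET.fromstring(z.read(name))
            for c in root.iter(NS + "threadedComment"):
                used.add((c.get("personId") or "").lower())        # comment authors
            for m in root.iter(NS + "mention"):
                used.add((m.get("mentionpersonId") or "").lower())  # mentioned personas
    return sorted(p for p in people if p[0].lower() not in used)
```

An empty result means every persona is still tied to visible or resolved content; a non-empty one is the list of names that outlived their comments.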

Resolved Threads Are Hidden, Not Deleted

Excel’s “Resolve thread” option flips the done attribute on the root comment to 1 and hides the bubble from the default review pane. The XML is otherwise untouched. A reader who unzips the workbook still sees:

  • The full thread text, including every reply.
  • Every commenter and mentioned persona.
  • Per-reply timestamps with hundredth-of-a-second precision, allowing reconstruction of how long the discussion took to resolve.
  • The done="1" flag itself, which signals to a forensic reader that the conversation reached closure — useful intelligence in disputed-document scenarios.

Resolving a thread is a UI convenience, not a metadata-removal step. For workbooks heading outside the company, every resolved thread is a piece of internal deliberation hiding behind a Show resolved comments checkbox in Excel.
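To see what the resolved flag and the timestamps give away together, the elapsed time of each resolved thread can be computed directly from the XML. This is an illustrative sketch: resolved_thread_durations and parse_dt are hypothetical names, and the two timestamp formats tried are assumptions based on the sample values shown earlier.

```python
from datetime import datetime
from xml.etree import ElementTree as ET

NS = "{http://schemas.microsoft.com/office/spreadsheetml/2018/threadedcomments}"

def parse_dt(value):
    """Parse a dT value with or without fractional seconds (assumed formats)."""
    for fmt in ("%Y-%m-%dT%H:%M:%S.%f", "%Y-%m-%dT%H:%M:%S"):
        try:
            return datetime.strptime(value.rstrip("Z"), fmt)
        except ValueError:
            continue
    raise ValueError("unrecognised dT value: %r" % value)

def resolved_thread_durations(xml_bytes):
    """Map each resolved root comment id to the time between first and last post."""
    comments = list(ET.fromstring(xml_bytes).iter(NS + "threadedComment"))
    result = {}
    for first in comments:
        if first.get("parentId") is not None or first.get("done") != "1":
            continue  # only resolved thread roots
        stamps = [parse_dt(first.get("dT"))]
        stamps += [parse_dt(c.get("dT")) for c in comments
                   if c.get("parentId") == first.get("id")]
        result[first.get("id")] = max(stamps) - min(stamps)
    return result
```

The values are datetime.timedelta objects, so a thread that took days to close stands out immediately from one that was settled in minutes.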

What an External Reader Reconstructs

A motivated reader who unzips an XLSX with active threaded comments can reconstruct, with no special tooling, a very detailed picture of the workbook’s authoring history.

  • A list of every collaborator and mentioned colleague, with their UPNs (which encode the tenant domain) or AD SIDs.
  • The corporate Microsoft 365 tenant. A single @contoso.com userId pinpoints the tenant and lets the reader probe public Microsoft identity endpoints for tenant ID, default domain, and federation status.
  • Approximate seniority signals. Display names with titles like “Dr.”, “CFO”, “General Counsel”, or naming patterns like Last, First (External) reveal organisational role.
  • A timeline of when the workbook was actively edited. The cluster of dT timestamps maps directly to working hours, time zones, and quiet windows.
  • The cells that received scrutiny. A reader sees exactly which numbers attracted comments — usually the most uncertain or politically sensitive figures in the model.
  • The actual deliberation. Every comment text, including resolved ones, is plain UTF-8 inside the XML. Forensic readers extract the entire stream with one shell pipeline.
  • External vs internal mentions. Mentioned personas with providerId="PeoplePicker" and free-text email addresses indicate external collaborators that the org may not have realised were inside the workbook’s metadata.
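The timeline point in the list above takes very little code to demonstrate. The sketch below buckets comment timestamps by hour of day; activity_by_hour is a hypothetical name, and it simply reads the hour digits out of the dT string, assuming the ISO-style layout shown in the earlier sample.

```python
from collections import Counter
from xml.etree import ElementTree as ET

NS = "{http://schemas.microsoft.com/office/spreadsheetml/2018/threadedcomments}"

def activity_by_hour(xml_bytes):
    """Count threaded comments per hour of day, based on the dT attribute."""
    hours = Counter()
    for c in ET.fromstring(xml_bytes).iter(NS + "threadedComment"):
        stamp = c.get("dT") or ""
        # Characters 11-12 of "YYYY-MM-DDTHH:MM:SS" hold the hour.
        if len(stamp) >= 13 and stamp[11:13].isdigit():
            hours[int(stamp[11:13])] += 1
    return dict(sorted(hours.items()))
```

A histogram concentrated in a narrow band of hours suggests a working day and, with the UTC offset it implies, a likely time zone for the team.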

Reading Threaded Comments Without Excel

Three quick recipes pull the comment and persona data out of a workbook without ever launching Office.

# 1. List every threaded comment and persona file
unzip -l workbook.xlsx | grep -E "persons|threadedComments"

# 2. Dump the persona directory
unzip -p workbook.xlsx "xl/persons/person.xml"

# 3. Pull every comment text plus its timestamp
unzip -p workbook.xlsx "xl/threadedComments/threadedComment1.xml" | \
  xmlstarlet sel -t -m "//*[local-name()='threadedComment']" -v "@dT" -o " | " -v "." -n

For programmatic auditing, a small Python script joins personas to comments and prints the reconstructed conversation:

import zipfile
from xml.etree import ElementTree as ET

NS = "{http://schemas.microsoft.com/office/spreadsheetml/2018/threadedcomments}"

with zipfile.ZipFile("workbook.xlsx") as z:
    # Build a lookup from persona GUID to (displayName, userId, providerId).
    people = {}
    if "xl/persons/person.xml" in z.namelist():
        root = ET.fromstring(z.read("xl/persons/person.xml"))
        for p in root.iter(NS + "person"):
            people[p.get("id")] = (p.get("displayName"),
                                   p.get("userId"), p.get("providerId"))

    # Walk every threaded-comment part and join each comment to its persona.
    for name in z.namelist():
        if name.startswith("xl/threadedComments/"):
            root = ET.fromstring(z.read(name))
            for c in root.iter(NS + "threadedComment"):
                person = people.get(c.get("personId"), ("?", "?", "?"))
                text = (c.findtext(NS + "text") or "").strip()
                print(c.get("dT"), person, c.get("ref"), text)

The output is a flat log of every threaded comment ever made in the workbook, with the full identity of every commenter, the cell they targeted, and the text they wrote — including the resolved threads that the Excel UI hides by default.

Document Inspector Coverage Is Inconsistent

Document Inspector exposes a single “Comments and annotations” toggle. In practice its behaviour against threaded comments has shifted across Office builds and is inconsistent enough that you should not rely on it as the last line of defence.

  • Comment text and the threadedComment XML files are usually removed when the toggle is run on a recent Microsoft 365 build with the workbook closed everywhere else.
  • Resolved (done="1") threads are sometimes left behind in older Office builds because the inspector enumerates only the visible comments.
  • xl/persons/person.xml frequently survives the cleanup, with the directory of every persona who ever participated or was mentioned still inside the file. The persona list is then orphaned but readable.
  • Mentions in still-open comment threads can re-create persona records on the next save, even after Document Inspector has been run, if the workbook is then edited.
  • The legacy comments1.xml file is removed by the same toggle, which can leave a dangling Override entry in [Content_Types].xml or a dangling relationship in the sheet’s .rels file that some downstream tools warn on.

How to Actually Strip the Threaded Comment Layer

A clean removal touches four artefacts: the threaded comment files, the persona directory, the relationship entries that point at them, and the content-type overrides. Four approaches follow, in increasing order of robustness.

1. Resolve, then delete every thread inside Excel

The Review > Comments pane has a “Show resolved comments” toggle and a delete option for each thread. Manually resolve everything, show resolved, delete each one, then save. This removes the visible threads but is slow and frequently leaves orphaned persona records, especially around @-mentions.

2. Run Document Inspector and verify after

File > Info > Check for Issues > Inspect Document, then tick “Comments and annotations” and run. Verify by unzipping the saved file and confirming both xl/threadedComments/ and xl/persons/ are gone. If the persons file remains, fall back to one of the next options.
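The verification step does not need a manual unzip. A minimal sketch, assuming the two folder names used throughout this post (residual_comment_parts is a hypothetical helper name):

```python
import zipfile

def residual_comment_parts(xlsx):
    """List any threaded-comment or persona parts still present in the package."""
    prefixes = ("xl/threadedComments/", "xl/persons/")
    with zipfile.ZipFile(xlsx) as z:
        return sorted(n for n in z.namelist() if n.startswith(prefixes))
```

An empty list after Document Inspector has run means both folders are gone; anything else names exactly which parts survived.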

3. Strip programmatically

A short Python script removes every threaded comment artefact in a single pass and rewrites the relationship and content-type files:

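import re
import zipfile
from pathlib import Path

src, dst = Path("in.xlsx"), Path("out.xlsx")

# Parts to drop from the package outright.
DROP = ("xl/threadedComments/", "xl/persons/", "xl/commentsExt.xml")

# Self-closing Override / Relationship elements that point at a dropped part.
# The character class is [^>]* because PartName, Target, and Type values contain slashes.
PATTERN = rb"<(?:Override|Relationship)\b[^>]*(?:threadedComment|person|commentsExt)[^>]*/>"

with zipfile.ZipFile(src) as zin, \
     zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
    for item in zin.infolist():
        if any(item.filename.startswith(p) for p in DROP):
            continue
        data = zin.read(item.filename)
        # Only relationship files and the content-type manifest need rewriting.
        if item.filename.endswith((".rels", "[Content_Types].xml")):
            data = re.sub(PATTERN, b"", data)
        zout.writestr(item, data)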

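With the relationship and content-type entries removed, Excel should reopen the resulting file cleanly. Because no worksheet XML element references the threaded-comment relationship by ID, no further fix-up inside the sheets is required. The script does not touch the legacy xl/comments1.xml part, so check that file separately for any comment text Excel may have mirrored there for older clients.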

4. Pre-share pipeline with persona reconciliation

For organisations that share workbooks regularly, build a server-side step into the file-sharing pipeline that strips both xl/threadedComments/ and xl/persons/, then runs a final scan to confirm that no residual personId attributes or threaded-comment references remain anywhere in the package. This composes well with the parallel sanitisation steps for external links, defined names, and printer settings, so the same pipeline can clean all four layers in one pass.
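The final scan in such a pipeline could look roughly like this. It is a sketch only: residual_markers is a hypothetical function, the marker strings are assumptions drawn from the part and attribute names discussed in this post, and a plain substring search can raise false positives if cell text happens to contain the same strings.

```python
import zipfile

# Lower-cased byte markers that should not appear anywhere after a clean strip.
MARKERS = (b"personid=", b"threadedcomment", b"persons/person.xml")

def residual_markers(xlsx):
    """Return (part name, marker) pairs for every XML or .rels part that still matches."""
    hits = []
    with zipfile.ZipFile(xlsx) as z:
        for name in z.namelist():
            if not name.endswith((".xml", ".rels")):
                continue
            data = z.read(name).lower()  # match regardless of attribute or path casing
            for marker in MARKERS:
                if marker in data:
                    hits.append((name, marker.decode()))
    return hits
```

A pipeline step can refuse to release the workbook whenever the returned list is non-empty, and log the offending part names for follow-up.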

Pre-Share Checklist

Run this checklist against any workbook leaving your organisation, especially if it has been collaborated on through Microsoft 365.

  • Have I confirmed whether xl/threadedComments/ and xl/persons/ exist in the ZIP?
  • For every persona, have I extracted displayName, userId, and providerId and verified none point to colleagues, mentions, or external email addresses that should not leave the organisation?
  • Does any userId contain a UPN that names the corporate Microsoft 365 tenant domain, or an AD SID that could be correlated against the directory?
  • Have I enumerated every threaded comment with done="1" and confirmed the resolved-thread text contains nothing internal that I do not want a recipient to read?
  • Have I scanned the comment timestamps (dT) for working-hours patterns or after-hours edits that I would not want to disclose?
  • Have I cross-referenced personas with mentions and confirmed no orphan persona records exist for colleagues that were @-mentioned in deleted comments?
  • For workbooks shipped externally as final deliverables, have I removed both xl/threadedComments/ and xl/persons/ entirely, plus their .rels and [Content_Types].xml entries?
  • Have I verified that the legacy xl/comments1.xml file does not contain residual notes that the threaded-comment cleanup ignored?
  • Have I cross-checked the rest of the metadata layers — defined names, external links, printer settings, calculation chain — that the same workbook may be carrying invisibly?

Conclusion

Threaded comments and the persona directory are the most identity-rich metadata layer the modern XLSX format introduces. Designed for real-time co-authoring, they bind every comment to a Microsoft 365 or Active Directory persona that includes a stable user identifier, a tenant-naming domain, and a display name — plus a parallel @-mention graph that pulls in colleagues who never typed a single character into the file. The Document Inspector handles the visible threads inconsistently and frequently leaves the persona directory behind, where it sits as a complete name-and-tenant beacon long after the conversation it documented has been resolved or deleted.

For workbooks staying inside an organisation, the layer is exactly what it claims to be: a useful collaboration record. For workbooks crossing the perimeter, it is a tenant fingerprint, an org-chart fragment, and a transcript of internal deliberation, all in one. The only durable defence is to strip xl/threadedComments/ and xl/persons/ entirely, along with the relationship and content-type entries that point at them — a small ZIP-level operation that closes a leak most security teams never inspect.

Audit Threaded Comments and Personas in Your Workbooks

Use MetaData Analyzer to enumerate every threaded comment, decode the persona directory, surface tenant-revealing UPNs and AD identifiers, and confirm both layers are gone before your workbooks leave the organisation.