Forensics

How to Verify Document Authenticity Using Metadata

When you receive an Excel file claiming to be an original financial report, a contract appendix, or an audit workpaper, how do you know it is genuine? Metadata analysis provides a systematic, evidence-based approach to verifying document authenticity—examining internal properties, structural artifacts, application fingerprints, and temporal consistency to determine whether a file is what it claims to be or has been fabricated, altered, or misrepresented.

By Forensics Team · February 24, 2026 · 21 min read

Why Document Authenticity Matters

Document authenticity verification is a cornerstone of digital forensics, legal proceedings, regulatory compliance, and business due diligence. In a world where anyone can create, copy, and modify Excel files with ease, the ability to determine whether a document is genuine—and has not been tampered with—is essential. Metadata provides the forensic fingerprints that make this possible.

Unlike the visible content of a spreadsheet, which can be altered freely, metadata is written automatically by the application, the operating system, and the file format specification. These metadata layers create an internal consistency that is extremely difficult to fabricate convincingly. A forger may change cell values and formatting, but matching every metadata field, XML namespace, ZIP archive timestamp, internal relationship record, and application version signature is a far more demanding task—and usually one where mistakes reveal the truth.

What Authenticity Verification Can Determine

  • Genuine origin: Whether a file was created by the claimed author using the claimed application and version
  • Temporal integrity: Whether creation and modification dates are consistent with the claimed document history
  • Content integrity: Whether the file contents have been altered after the last legitimate save
  • Provenance chain: Whether the document has passed through the claimed sequence of editors and systems
  • Format consistency: Whether the internal file structure matches what the claimed application would produce
  • Fabrication indicators: Whether the file was created from scratch to look like an older or different document
  • Template detection: Whether a document was derived from another file rather than created independently

Step 1: Examine Core Document Properties

The first layer of authenticity verification begins with the document's core properties stored in docProps/core.xml. These fields record the claimed author, creation date, last editor, and modification date. While these are the easiest properties to manipulate, they also provide the baseline claims that all subsequent checks will validate against.

Author and Editor Verification

The dc:creator and cp:lastModifiedBy fields identify the original author and most recent editor. These should match known user identities within the organization. Inconsistencies here are often the first sign that something is wrong.

What to Check

  • Does the creator name match a real person in the organization?
  • Does the name format match the organization's naming convention (e.g., "John Smith" vs "jsmith" vs "DOMAIN\jsmith")?
  • Is the lastModifiedBy the same as the creator, or does it show the expected chain of editors?
  • Are there generic names like "User", "Admin", or blank fields that suggest a non-standard environment?

Red Flags

  • Creator name doesn't exist in company records
  • Name format differs from the organization's standard Office installation
  • Creator and lastModifiedBy are identical on a document that supposedly went through review
  • Name encoding issues (e.g., UTF-8 vs Latin-1 mismatch) that suggest cross-system copying

# Extract and examine core properties
unzip -o document.xlsx -d doc_extract/
cat doc_extract/docProps/core.xml

# Example output revealing a mismatch:
# <dc:creator>Sarah Johnson</dc:creator>
# <cp:lastModifiedBy>User</cp:lastModifiedBy>
# The generic "User" as last editor is suspicious
# when the claimed workflow involves named reviewers

Timestamp Consistency Check

The creation and modification timestamps must be logically consistent with each other and with the document's claimed history. They are stored in ISO 8601 format, and the creation date must precede or equal the modification date.

Logical Consistency Rules

  • Created ≤ Modified: The creation date must always precede or equal the last modification date. A violation is a definitive sign of manipulation.
  • Timezone coherence: Both timestamps should use the same timezone offset. Mixed offsets suggest the file was modified on a system in a different timezone.
  • Business hours plausibility: Were the creation and modification times during reasonable working hours for the claimed author's timezone?
  • Weekend and holiday check: Documents claiming creation on weekends or company holidays warrant additional scrutiny.
  • Precision consistency: Both timestamps should have the same level of precision (seconds vs. no seconds). Mixed precision suggests different writing mechanisms.
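The first two rules are easy to script. A minimal Python sketch using only the standard library; the core.xml fragment below is fabricated for illustration, and the helper name check_timestamps is ours:

```python
from datetime import datetime
from xml.etree import ElementTree as ET

NS = {"dcterms": "http://purl.org/dc/terms/"}

# Fabricated core.xml fragment for illustration
CORE_XML = """<cp:coreProperties
  xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
  xmlns:dcterms="http://purl.org/dc/terms/">
  <dcterms:created>2025-03-15T08:30:00Z</dcterms:created>
  <dcterms:modified>2025-03-14T17:05:00Z</dcterms:modified>
</cp:coreProperties>"""

def check_timestamps(core_xml: str) -> list:
    """Return a list of timestamp anomalies found in a core.xml document."""
    root = ET.fromstring(core_xml)
    # fromisoformat cannot parse a trailing 'Z' before Python 3.11
    created = datetime.fromisoformat(
        root.find("dcterms:created", NS).text.replace("Z", "+00:00"))
    modified = datetime.fromisoformat(
        root.find("dcterms:modified", NS).text.replace("Z", "+00:00"))
    flags = []
    if created > modified:
        flags.append("created is AFTER modified -- definitive manipulation sign")
    if created.utcoffset() != modified.utcoffset():
        flags.append("mixed timezone offsets")
    if created == modified:
        flags.append("identical timestamps -- document saved exactly once")
    return flags

print(check_timestamps(CORE_XML))
```

Here the fabricated fragment trips the first rule: the modification date falls a day before the creation date.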

# Parse and compare timestamps (\K keeps only the text after the
# match; a variable-length lookbehind would be rejected by grep -P)
grep -oP '<dcterms:created[^>]*>\K[^<]+' doc_extract/docProps/core.xml
grep -oP '<dcterms:modified[^>]*>\K[^<]+' doc_extract/docProps/core.xml

# Suspicious example:
# Created:  2025-03-15T08:30:00Z
# Modified: 2025-03-15T08:30:00Z
# Identical timestamps on a complex workbook suggest
# it was saved exactly once — unusual for a document
# that claims multiple rounds of review

Step 2: Verify Application Fingerprints

Every application that creates or modifies an Excel file leaves a distinctive fingerprint in the file's metadata. The extended properties in docProps/app.xml record the application name, version, and other characteristics that can be used to confirm or contradict claims about how the file was created. This is one of the most powerful authenticity checks because forgers rarely think to match application signatures.

Application Version Analysis

The Application and AppVersion fields identify the software that last saved the file. Each version of Excel produces a specific version string, and the format of the generated XML varies slightly between versions. Mismatches between the claimed creation environment and the actual application fingerprint are strong evidence of inauthenticity.

Known Excel Version Signatures

AppVersion   Excel Version         Release Year
12.0000      Excel 2007            2007
14.0300      Excel 2010            2010
15.0300      Excel 2013            2013
16.0300      Excel 2016/2019/365   2016+

Excel 2016, 2019, and current Microsoft 365 builds all report AppVersion 16.0300, so this field identifies the 2016-and-later era but cannot distinguish versions within it.

# Check application fingerprint
cat doc_extract/docProps/app.xml

# Look for key fields:
# <Application>Microsoft Excel</Application>
# <AppVersion>16.0300</AppVersion>
# <TotalTime>487</TotalTime>

# A file claiming to be from 2009 but showing
# AppVersion 16.0300 was actually last saved
# with Excel 2016 or later

Editing Time Validation

The TotalTime property in app.xml records the cumulative minutes the document was open for editing. This value should be proportional to the document's complexity and claimed history. It is one of the hardest properties for forgers to set correctly because it requires understanding the realistic editing time for the document in question.

Plausibility Checks

  • A complex 50-sheet workbook with 3 minutes of editing time was likely generated programmatically
  • Editing time of 0 indicates the file was never opened for editing in Excel after creation
  • TotalTime exceeding the span between created and modified dates (in minutes) is impossible
  • Round numbers (exactly 60, 120, 300) suggest manual manipulation
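These plausibility checks translate directly into code. A Python sketch, assuming the timestamps have already been read from core.xml and TotalTime from app.xml (the 487-minute figure is illustrative):

```python
from datetime import datetime

def editing_time_flags(created: str, modified: str, total_time_min: int) -> list:
    """Flag implausible TotalTime values against the created/modified span."""
    c = datetime.fromisoformat(created.replace("Z", "+00:00"))
    m = datetime.fromisoformat(modified.replace("Z", "+00:00"))
    span_min = (m - c).total_seconds() / 60
    flags = []
    if total_time_min > span_min:
        flags.append("TotalTime exceeds the created-to-modified span")
    if total_time_min == 0:
        flags.append("zero editing time")
    if total_time_min > 0 and total_time_min % 60 == 0:
        flags.append("suspiciously round TotalTime")
    return flags

# 487 claimed editing minutes inside a 2-hour save window: impossible
print(editing_time_flags("2025-03-15T08:30:00Z", "2025-03-15T10:30:00Z", 487))
```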

Cross-Reference Technique

  • Compare TotalTime against the number of cells with data
  • Estimate minimum typing time based on content volume
  • Factor in formula complexity—complex formulas take longer
  • Consider formatting effort for heavily styled workbooks

Real-World Example

An employee submitted a quarterly financial report claiming it had been developed over three weeks. Metadata analysis revealed a TotalTime of 7 minutes and an application signature corresponding to LibreOffice Calc rather than the company's standard Microsoft Excel installation. Further investigation showed the file was generated by a Python script and manually saved once to add superficial metadata, but the editing time and application fingerprint betrayed its true origin.

Non-Excel Application Detection

Files created by third-party libraries (openpyxl, Apache POI, SheetJS) or non-Excel applications (Google Sheets, LibreOffice Calc) have distinctive signatures that differ from genuine Microsoft Excel output. Detecting these signatures is critical when a document claims to have been created in Excel.

Common Third-Party Signatures

  • openpyxl (Python): Sets Application to "Microsoft Excel" but uses different XML formatting, namespace prefixes, and element ordering than real Excel
  • Apache POI (Java): May include Application as "Apache POI" or mimic Excel; check for non-standard XML comments or processing instructions
  • Google Sheets: Exported XLSX files often lack app.xml entirely or include Google-specific custom properties
  • LibreOffice Calc: Uses different default fonts, style definitions, and may include meta:generator tags identifying the LibreOffice version
  • SheetJS: Produces minimal XML with fewer default styles and often lacks chart sheets, pivot cache, or other complex structures

# Check for non-standard content types
# (quote the path so the shell does not treat [...] as a glob)
xmllint --format 'doc_extract/[Content_Types].xml'

# Check for Google Sheets signature
grep -r "google\|Sheets\|spreadsheets.google" doc_extract/

# Check for LibreOffice signature
grep -r "LibreOffice\|meta:generator" doc_extract/

# Check for openpyxl/POI artifacts
grep -r "openpyxl\|Apache POI" doc_extract/

Step 3: Analyze XML Structure Consistency

Every version of Excel produces XLSX files with characteristic XML patterns—specific namespace declarations, element ordering, default style definitions, and relationship structures. These patterns form a structural fingerprint that is extremely difficult to replicate perfectly. Analyzing these structural elements is one of the most reliable methods for detecting fabricated documents.

Namespace and Schema Validation

XLSX files declare XML namespaces that correspond to specific Office Open XML schema versions. These namespaces must be consistent across all XML files within the archive. Mismatched namespaces are a strong indicator that files were manually assembled from different sources.

Key Namespace Checks

  • Spreadsheet ML namespace: Should be consistent across workbook.xml, all sheet files, and shared strings
  • Relationship namespace: All .rels files should use the same relationship schema URI
  • Extended properties namespace: Must match the Office version indicated by AppVersion
  • Transitional vs. Strict: The document should consistently use either transitional or strict OOXML, not a mixture

# Extract all namespace declarations across the archive
grep -rh "xmlns" doc_extract/ | sort -u

# Check for mixed transitional/strict namespaces
# Transitional uses: schemas.openxmlformats.org
# Strict uses: purl.oclc.org/ooxml/
grep -r "purl.oclc.org" doc_extract/
grep -r "schemas.openxmlformats.org" doc_extract/

# A genuine file uses one or the other consistently
# Finding both is a strong fabrication indicator

Relationship File Integrity

XLSX files use relationship files (.rels) to define how internal components connect to each other. Every sheet, style definition, shared string table, and embedded object must have a corresponding relationship entry. Missing, orphaned, or inconsistent relationships reveal manual file manipulation.

Integrity Checks

  • Every target in .rels must point to an existing file
  • Every XML file must be referenced by at least one .rels entry
  • Relationship IDs (rId1, rId2...) should follow a sequential pattern
  • Gaps in rId sequences suggest deleted components

Fabrication Indicators

  • Relationship targets pointing to non-existent files
  • Files present in the archive but not referenced
  • Duplicate relationship IDs for different targets
  • Relationship types that don't match the target file content
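The target-existence check lends itself to automation. A Python sketch using only the standard library; the tiny archive it builds is fabricated to demonstrate a missing relationship target, and the helper name rels_anomalies is ours, not part of any library:

```python
import io
import posixpath
import zipfile
from xml.etree import ElementTree as ET

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"

def rels_anomalies(xlsx_bytes: bytes) -> list:
    """Report .rels targets that do not exist inside the archive."""
    problems = []
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as z:
        names = set(z.namelist())
        for rels in (n for n in names if n.endswith(".rels")):
            # a part's .rels lives in a _rels/ folder next to the part
            base = posixpath.dirname(posixpath.dirname(rels))
            for rel in ET.fromstring(z.read(rels)).iter(f"{{{REL_NS}}}Relationship"):
                target = rel.get("Target")
                if target.startswith("http"):  # external target, skip
                    continue
                resolved = posixpath.normpath(posixpath.join(base, target))
                if resolved not in names:
                    problems.append(f"{rels}: target {target} missing")
    return problems

# Fabricated archive: the root .rels references a workbook part
# that was never packed -- a classic manual-assembly mistake
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("_rels/.rels",
               f'<Relationships xmlns="{REL_NS}">'
               f'<Relationship Id="rId1" Type="officeDocument" '
               f'Target="xl/workbook.xml"/></Relationships>')
print(rels_anomalies(buf.getvalue()))
```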

# List all relationship files with their contents
find doc_extract/ -name "*.rels" -exec echo "=== {} ===" \; -exec cat {} \;

# Cross-reference: list all files in the archive
find doc_extract/ -type f | sort

# Check for orphaned files not in any .rels
# or .rels targets that don't exist on disk
# Mismatches indicate manual file assembly

Shared String Table Analysis

The shared string table (xl/sharedStrings.xml) stores all unique text strings used in the workbook. Excel adds strings to this table in the order they are first entered, creating a chronological record of data entry. The count attribute (total string references) and uniqueCount attribute (unique strings) should match the actual cell content of the workbook.

Verification Technique

  • Count verification: Sum all string references in sheet XML files and compare against the count attribute. Mismatches indicate post-processing
  • Unique count check: Count distinct <si> elements and compare against uniqueCount. A discrepancy means the file was manually edited
  • String ordering: Headers and titles typically appear first in the table. If data strings precede structural strings, the sheet may have been restructured
  • Orphaned strings: Strings in the table but not referenced by any cell indicate deleted content or copy-paste from another source
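The unique-count check can be done robustly with an XML parser. A Python sketch over a fabricated shared string fragment (real tables can be megabytes, but the comparison is the same):

```python
from xml.etree import ElementTree as ET

SST_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"

def unique_count_check(shared_strings_xml: str):
    """Return (declared uniqueCount, actual number of <si> elements)."""
    root = ET.fromstring(shared_strings_xml)
    declared = int(root.get("uniqueCount"))
    actual = len(root.findall(f"{{{SST_NS}}}si"))
    return declared, actual

# Fabricated table declaring 523 unique strings but holding only 2:
# a mismatch like this means the XML was edited after generation
sst = (f'<sst xmlns="{SST_NS}" count="1847" uniqueCount="523">'
       '<si><t>Revenue</t></si><si><t>Q1</t></si></sst>')
print(unique_count_check(sst))
```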

# Check shared string table attributes
head -5 doc_extract/xl/sharedStrings.xml

# Example: <sst count="1847" uniqueCount="523">

# Count actual <si> elements (grep -c counts matching lines, and
# sharedStrings.xml is often a single line, so count occurrences
# with grep -o | wc -l instead)
grep -o "<si[ >]" doc_extract/xl/sharedStrings.xml | wc -l

# If uniqueCount says 523 but there are 519 <si>
# elements, the file has been manually edited after
# the shared string table was generated

Step 4: Inspect ZIP Archive Artifacts

XLSX files are ZIP archives containing the XML files and other resources that compose the workbook. The ZIP format itself contains metadata—file modification timestamps, compression methods, and internal ordering—that provides an independent layer of forensic evidence. Because most forgers focus on the XML content, ZIP-level metadata often preserves the truth about a file's actual history.

ZIP Entry Timestamps

Each file within a ZIP archive has its own modification timestamp. When Excel saves a workbook, it writes all internal files with timestamps that should be close to the save time. Inconsistencies in these timestamps reveal manual archive assembly.

What to Look For

  • Uniform timestamps: All entries in a genuine Excel-saved file should have nearly identical timestamps (within seconds)
  • Mixed timestamps: If some entries show 2024 dates and others show 2025 dates, the archive was assembled from files created at different times
  • Timestamp vs. core.xml: ZIP entry timestamps should closely match the dcterms:modified value from core.xml
  • DOS time precision: ZIP timestamps use DOS time format (2-second precision). Timestamps with odd seconds may have been set by non-standard tools
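Python's zipfile module exposes these per-entry timestamps directly, so the uniformity check can be automated. The archive below is fabricated to reproduce the mixed-date scenario:

```python
import io
import zipfile

def entry_timestamps(xlsx_bytes: bytes) -> list:
    """Return the distinct (y, m, d, h, min, s) stamps on the ZIP entries."""
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as z:
        return sorted({info.date_time for info in z.infolist()})

# Fabricated archive mixing a 2024-dated sheet into a 2025 save
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    for name, stamp in [
        ("[Content_Types].xml", (2025, 3, 15, 8, 30, 0)),
        ("_rels/.rels", (2025, 3, 15, 8, 30, 0)),
        ("xl/worksheets/sheet1.xml", (2024, 11, 2, 14, 22, 0)),
    ]:
        z.writestr(zipfile.ZipInfo(name, date_time=stamp), "<x/>")

# More than one distinct stamp means the archive was assembled
# from files written at different times
print(entry_timestamps(buf.getvalue()))
```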

# List all ZIP entries with timestamps
unzip -l document.xlsx

# Detailed view including compression method
zipinfo document.xlsx

# Example output showing suspicious mixed dates:
#   Length  Method    Size  Cmpr        Date   Time  Name
# --------  ------  ------  ----  ----------  -----  ----
#     1368  Defl:N     432   68%  2025-03-15  08:30  [Content_Types].xml
#      588  Defl:N     243   59%  2025-03-15  08:30  _rels/.rels
#    23847  Defl:N    5692   76%  2024-11-02  14:22  xl/worksheets/sheet1.xml
#                                 ^^^^^^^^^^
# This sheet has a timestamp 4 months older than
# the rest of the archive — it was likely copied
# from a different file and inserted manually

Compression Method Consistency

Excel uses specific compression settings when creating XLSX files. The compression method (Deflate) and compression level should be consistent across all entries. Different compression methods or levels for different files within the archive suggest that the archive was assembled using a tool other than Excel.

Excel Defaults

  • Compression method: Deflate for all XML files
  • Images may use Store (no compression) since they are already compressed
  • Consistent compression ratio across similar file types
  • Standard ZIP local file headers

Anomaly Indicators

  • Mixed compression methods (Deflate and Store) for XML files
  • Unusually high or low compression ratios
  • ZIP64 extensions on small files (suggests non-Excel tool)
  • Non-standard extra field data in ZIP headers
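A sketch of the consistency check using Python's zipfile module; the archive here is fabricated so that one XML part is stored uncompressed among deflated entries, and the helper name xml_compression_profile is ours:

```python
import io
import zipfile
from collections import Counter

METHOD_NAMES = {zipfile.ZIP_DEFLATED: "Deflate", zipfile.ZIP_STORED: "Store"}

def xml_compression_profile(xlsx_bytes: bytes) -> Counter:
    """Count the compression methods used for XML/.rels entries."""
    counts = Counter()
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as z:
        for info in z.infolist():
            if info.filename.endswith((".xml", ".rels")):
                counts[METHOD_NAMES.get(info.compress_type,
                                        str(info.compress_type))] += 1
    return counts

# Fabricated archive: one XML part stored uncompressed among
# deflated ones -- the signature of a non-Excel repacking tool
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("xl/workbook.xml", "<x/>" * 50,
               compress_type=zipfile.ZIP_DEFLATED)
    z.writestr("xl/styles.xml", "<x/>" * 50,
               compress_type=zipfile.ZIP_STORED)
print(xml_compression_profile(buf.getvalue()))
```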

Step 5: Examine Style and Formatting Evidence

The styles file (xl/styles.xml) contains every cell format, number format, font definition, fill pattern, and border style used in the workbook. Excel generates this file with a set of default styles that varies by version and locale. The presence, absence, and ordering of these styles provides another layer of authenticity evidence.

Default Style Verification

Every version of Excel includes a specific set of built-in styles (Normal, Comma, Currency, Percent, etc.) with version-specific default fonts and formatting. These defaults serve as a version fingerprint that should match the AppVersion claim.

Version-Specific Style Indicators

  • Default font: Excel 2007 and later default to Calibri 11pt; Excel 2003 and earlier used Arial 10pt. The default font in styles.xml should match the claimed version.
  • Built-in number formats: Excel defines number format IDs 0-163 internally. Custom formats start at 164+. The presence of non-standard built-in IDs suggests a non-Excel generator.
  • Color theme: Excel 2013+ uses a different default color theme than 2007/2010. Theme colors in styles should match the version era.
  • Cell style count: The number of built-in cell styles varies by version. A mismatch between the style count and AppVersion is suspicious.

# Check default font in styles
grep -i "font" doc_extract/xl/styles.xml | head -10

# Count number format occurrences (grep -o | wc -l works even
# when styles.xml is a single line)
grep -o "numFmt" doc_extract/xl/styles.xml | wc -l

# Check for theme reference
head -20 doc_extract/xl/theme/theme1.xml

# A file claiming Excel 2007 origin but containing
# Excel 2016 theme colors was not created when claimed

Unused Style Detection

When cells are deleted or reformatted, their style definitions often remain in the styles file. These orphaned styles can reveal the history of editing and indicate whether content was imported from other workbooks.

What Orphaned Styles Reveal

  • Cross-workbook copying: If a workbook about sales data contains styles for scientific notation, engineering units, or medical codes, content may have been pasted from a different workbook
  • Deleted content: Currency formats for EUR in a USD-only workbook suggest deleted international data
  • Template origins: Professional templates include distinctive custom styles. Finding these styles in a document that claims to be original reveals its true origin
  • Multiple sources: An excessive number of similar-but-slightly-different styles (e.g., five variations of "Calibri 11pt") suggests content merged from multiple sources

Step 6: Cross-Reference Multiple Evidence Layers

The most powerful authenticity verification comes from cross-referencing evidence across all the layers examined above. No single metadata field proves or disproves authenticity on its own, but inconsistencies between independent evidence sources create a compelling case. This is where forensic analysis becomes truly effective.

The Cross-Reference Matrix

Build a verification matrix that compares claims across all metadata layers. Each row represents a factual claim about the document, and each column represents an independent source of evidence. Contradictions between sources are highlighted as anomalies.

Claim                   core.xml               app.xml               ZIP Headers        Structural
Created March 2025      2025-03-15             TotalTime: 487        2026-01-08         CONFLICT
Created in Excel 2016   N/A                    AppVersion: 16.0300   N/A                2016+ themes
Author: Sarah Johnson   dc:creator match       N/A                   N/A                Default printer: Home_HP
Reviewed by 3 people    lastModifiedBy: User   TotalTime: 487 min    Single timestamp   CONFLICT

Systematic Verification Workflow

Follow this structured workflow to systematically verify document authenticity using all available metadata evidence.

1. Extract and Preserve

Create a forensic copy of the file. Extract the ZIP archive to a working directory. Record file hashes (SHA-256) before any analysis to establish evidence integrity.
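The hashing step is a one-liner in Python; in a real case you would read the evidence file in binary mode and record the digest before opening the file in any other tool:

```python
import hashlib

def evidence_hash(data: bytes) -> str:
    """SHA-256 digest of the evidence file contents."""
    return hashlib.sha256(data).hexdigest()

# In practice: evidence_hash(open("document.xlsx", "rb").read())
print(evidence_hash(b"example evidence bytes"))
```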

2. Document the Claims

Record what the document claims to be: who created it, when, using what software, and what its revision history should look like. These are the assertions you will test.

3. Extract All Metadata Layers

Examine core properties, extended properties, ZIP archive metadata, XML structure, shared strings, and styles. Record all findings in a structured format.

4. Cross-Reference and Identify Conflicts

Build the verification matrix. Compare each claim against every evidence source. Flag all inconsistencies, no matter how minor. A pattern of small inconsistencies is more significant than any single anomaly.

5. Assess and Report

Classify the document as authentic, questionable, or fabricated based on the weight of evidence. Document your methodology, findings, and conclusions in a forensic report that can withstand scrutiny.

Advanced Verification Techniques

Beyond the standard verification steps, several advanced techniques can provide additional evidence when standard checks are inconclusive or when dealing with sophisticated forgeries.

Reference Document Comparison

One of the most effective advanced techniques is comparing the suspect document against a known-authentic reference document from the same environment. If the claimed author creates documents regularly on the same system, comparing metadata patterns between the suspect file and a verified file can reveal inconsistencies that are invisible in isolation.

What to Compare

  • Author name format: Does the creator name match exactly, including capitalization, spacing, and encoding?
  • Default printer: Excel stores the last-used printer name. The suspect file should reference the same printer as other files from the same workstation.
  • Custom properties: Some organizations deploy group policy settings that add custom document properties. Genuine files from the organization should all contain these properties.
  • XML formatting patterns: Excel's XML serialization has subtle version-specific patterns (whitespace, attribute ordering, self-closing tags). Files from the same Excel installation should match.
  • Theme and style definitions: If the organization uses a custom Office theme, the theme XML should be identical across genuine documents.

# Compare core properties between suspect and reference
diff suspect_extract/docProps/core.xml reference_extract/docProps/core.xml

# Compare style definitions
diff suspect_extract/xl/styles.xml reference_extract/xl/styles.xml

# Compare theme files
diff suspect_extract/xl/theme/theme1.xml reference_extract/xl/theme/theme1.xml

# Identical themes confirm same environment;
# different themes suggest different origin

Calculation Chain Analysis

The calculation chain file (xl/calcChain.xml) records the order in which Excel evaluates formulas. This chain is built incrementally as formulas are added and provides a hidden chronological record of formula creation. Its presence and structure can help verify that a complex workbook was built over time rather than generated all at once.

Calculation Chain Indicators

  • Missing calcChain.xml: A workbook with formulas but no calculation chain was likely generated programmatically. Excel always creates this file when formulas are present.
  • Chain ordering: In a naturally built workbook, the calculation chain follows the logical order of formula dependencies. A scrambled chain suggests automated generation.
  • Sheet references: The calculation chain references sheets by their internal IDs. These IDs should match the sheet definitions in workbook.xml.
  • Formula count: The number of entries in the calculation chain should match the total formula count across all sheets.
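The formula-count comparison is more reliable with an XML parser than with grep. A Python sketch over fabricated sheet and calcChain fragments:

```python
from xml.etree import ElementTree as ET

MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"

def formula_vs_chain(sheet_xml: str, calc_chain_xml: str):
    """Return (formula cells in the sheet, entries in the calc chain)."""
    formulas = len(ET.fromstring(sheet_xml).findall(f".//{{{MAIN_NS}}}f"))
    entries = len(ET.fromstring(calc_chain_xml).findall(f"{{{MAIN_NS}}}c"))
    return formulas, entries

# Fabricated fragments: two formula cells but only one chain entry,
# consistent with a formula injected after the file was generated
sheet = (f'<worksheet xmlns="{MAIN_NS}"><sheetData><row>'
         '<c r="A1"><f>1+1</f></c><c r="A2"><f>A1*2</f></c>'
         '</row></sheetData></worksheet>')
chain = f'<calcChain xmlns="{MAIN_NS}"><c r="A1" i="1"/></calcChain>'
print(formula_vs_chain(sheet, chain))
```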

# Check if calculation chain exists
ls -la doc_extract/xl/calcChain.xml

# View calculation chain structure
cat doc_extract/xl/calcChain.xml

# Count formula cells in the sheet vs. chain entries
# (count occurrences, not lines, since the XML may be one line)
grep -o "<f[ >]" doc_extract/xl/worksheets/sheet1.xml | wc -l
grep -o "<c " doc_extract/xl/calcChain.xml | wc -l

# Mismatched counts indicate post-generation editing

External Link and Connection Analysis

Excel workbooks often contain references to external files, data connections, and linked objects. These references embed file paths, server names, and network locations that reveal the environment where the document was actually used—regardless of what the core properties claim.

Environmental Evidence in Links

  • File paths: External references contain full paths like C:\Users\RealAuthor\Documents\source.xlsx that reveal the actual user and system
  • Network shares: UNC paths like \\server\share\folder identify the network environment where the file was used
  • Data connections: ODBC and OLE DB connection strings contain server names, database names, and sometimes credentials
  • Embedded objects: OLE objects carry their own metadata, including the application and file path of the original object

# Search for external references
grep -r "externalLink\|connection\|oleObject" doc_extract/

# Look for Windows paths, macOS home directories, and UNC shares
grep -rE 'C:\\|/Users/|\\\\' doc_extract/xl/

# Check for data connection files
ls doc_extract/xl/connections.xml 2>/dev/null
ls doc_extract/xl/externalLinks/ 2>/dev/null

# A document supposedly created by "Sarah Johnson"
# but with external links referencing
# C:\Users\MikeW\Desktop\ reveals the true author

Common Forgery Patterns and How to Detect Them

Understanding the common patterns used to forge or misrepresent Excel documents helps investigators know what to look for. Each pattern has characteristic metadata signatures that betray the forgery.

Pattern 1: Backdated Documents

A document is created today but the creation date is changed to make it appear older. This is the most common forgery pattern, used to fabricate evidence of prior knowledge, meet deadlines retroactively, or establish false timelines.

Detection Signatures

  • ZIP archive timestamps don't match core.xml creation date
  • AppVersion indicates a version of Excel that wasn't available at the claimed creation date
  • XML namespaces use schemas that were published after the claimed creation date
  • Default theme colors match a version of Office released after the claimed date
  • TotalTime is too low for the claimed document lifespan
  • Shared string table contains references to events or terms that didn't exist at the claimed creation date
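The AppVersion-versus-claimed-date signature can be scripted. The version-to-year mapping below is an assumption drawn from the release table earlier in this article, not an authoritative Microsoft reference:

```python
# Assumed mapping from AppVersion to the earliest year that version
# of Excel existed (per the version table earlier in this article)
APPVERSION_MIN_YEAR = {
    "12.0000": 2007,  # Excel 2007
    "14.0300": 2010,  # Excel 2010
    "15.0300": 2013,  # Excel 2013
    "16.0300": 2016,  # Excel 2016 era and later
}

def backdating_flag(app_version: str, claimed_year: int) -> bool:
    """True if the claimed creation year predates the saving app's release."""
    min_year = APPVERSION_MIN_YEAR.get(app_version)
    return min_year is not None and claimed_year < min_year

# A file claiming 2009 origin but last saved by AppVersion 16.0300
print(backdating_flag("16.0300", 2009))
```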

Pattern 2: Author Substitution

A document created by one person has its author metadata changed to attribute it to someone else. This is used to fabricate work product, avoid accountability, or create false evidence of authorship.

Detection Signatures

  • Author name format doesn't match the claimed author's known Office configuration
  • External links reference a different user's directory
  • Printer name references a printer not associated with the claimed author's workspace
  • VBA project properties (if macros exist) contain a different author or computer name
  • Custom document properties contain organizational identifiers from a different department or company

Pattern 3: Programmatic Generation Disguised as Manual Work

A document is generated by a script or application but modified to appear as if it was created manually in Excel. This pattern is used to fabricate financial reports, audit evidence, and compliance documentation.

Detection Signatures

  • TotalTime is zero or very low relative to document complexity
  • Missing calculation chain despite containing formulas
  • Minimal or non-standard style definitions
  • XML formatting differs from genuine Excel output (element ordering, whitespace, namespace prefixes)
  • Perfectly sequential shared string ordering (no reordering from edits)
  • No print area or page setup definitions that would be present in a document prepared for review
  • Cell values that are numerically perfect (no rounding artifacts) suggesting direct data injection rather than formula calculation

Pattern 4: Selective Content Alteration

An authentic document is modified to change specific values while preserving the overall appearance of authenticity. This is the hardest pattern to detect because most metadata remains genuine.

Detection Signatures

  • Shared string count/uniqueCount mismatches from direct XML editing
  • Cell style references that don't match the applied formatting (if cell XML was edited directly)
  • Calculation chain inconsistencies where formula dependencies don't match the current cell values
  • Modified date is very close to the current date while the document claims a longer history
  • ZIP archive shows re-compression artifacts (different compression ratio for specific sheets)
  • Formula results that are inconsistent with current formula definitions (cached values don't match)

Building a Forensic Authenticity Report

When document authenticity verification is performed for legal, regulatory, or corporate governance purposes, the findings must be documented in a structured forensic report that can withstand cross-examination and peer review.

Essential Report Components

1. Evidence Preservation Record

Document the chain of custody: how you received the file, the original file hash (SHA-256), where the working copy is stored, and what tools were used for analysis. This establishes that your analysis was performed on an unmodified copy.

2. Claims Under Test

Explicitly state what the document claims to be: the claimed author, creation date, modification history, application used, and any other assertions. These form the hypotheses that your analysis will test.

3. Methodology

Describe the analysis steps performed, the tools used (including versions), and the order of operations. This allows another examiner to reproduce your findings independently.

4. Findings and Evidence

Present each metadata finding with its source, the expected value, the actual value, and the significance of any discrepancy. Include raw XML excerpts, ZIP listings, and command outputs as supporting evidence.

5. Cross-Reference Matrix

Include the full verification matrix showing how each claim was tested against multiple evidence sources, with clear highlighting of contradictions and anomalies.

6. Conclusion and Confidence Level

State your conclusion about the document's authenticity and assign a confidence level. Use qualified language: "The metadata evidence is consistent with fabrication" rather than "The document is fake." Distinguish between what the evidence shows and what it suggests.

Using MetaData Analyzer for Authenticity Verification

While manual forensic analysis gives you the deepest understanding of file metadata, MetaData Analyzer automates many of these verification steps, providing rapid assessment that can guide deeper investigation.

Automated Verification with MetaData Analyzer

Core Property Extraction

Instantly view author, creation date, modification date, and revision count with visual highlighting of suspicious patterns.

Application Fingerprint Display

See the application name, version, and editing time at a glance. Cross-reference against the claimed creation environment.

Timestamp Consistency Analysis

Automatic comparison of creation, modification, and access timestamps to flag logical impossibilities and suspicious patterns.

Hidden Content Detection

Detect hidden sheets, comments, tracked changes, and embedded objects that may contradict the document's claimed content.

Metadata Removal for Clean Sharing

Once verification is complete, strip all metadata before sharing to prevent exposing your forensic methodology or sensitive investigation details.

Key Takeaways

Multi-Layer Verification

Never rely on a single metadata field. True authenticity verification requires cross-referencing evidence from core properties, application fingerprints, XML structure, ZIP archives, and file content to build a complete picture.

Application Fingerprints Are Hard to Fake

Each Excel version produces distinctive XML patterns, default styles, and structural characteristics. These fingerprints are often overlooked by forgers and provide some of the most reliable authenticity evidence.

Timestamps Exist in Multiple Places

Core properties, ZIP headers, and file system metadata all record timestamps independently. A forger who changes one source rarely changes all three consistently, creating detectable contradictions.

Document Your Methodology

Forensic findings are only valuable if they can withstand scrutiny. Always preserve evidence integrity, document your analysis steps, and present findings with appropriate confidence qualifiers.

Verify Your Document's Authenticity

Use MetaData Analyzer to instantly examine the metadata of any Excel file. Identify author information, timestamps, application fingerprints, and hidden properties that reveal whether a document is genuine.