The Impact

This analysis gave the team a clear view of how widespread duplicate company records were inside HubSpot and revealed the data completeness issues contributing to them. With Petavue’s structured process, the findings translate directly into improvements that support revenue teams and operational efficiency.

  • Improved visibility into 2,773 duplicate groups, enabling targeted cleanup efforts.
  • Clear identification of missing fields and inconsistent data entry patterns, paving the way for stronger CRM data integrity.
  • Actionable recommendations for building a more accurate, unified company database.
DATA SOURCE: Salesforce, HubSpot
CATEGORY: RevOps
COMPLEXITY: Moderate
ANALYSIS TIME: Manual ~1.5 hrs; Petavue ~8 mins
This analysis is inspired by real analyses run by our users and has been recreated on a test setup to illustrate the process.
The Prompt
Please run an analysis on the Company table in HubSpot. I would like to see Companies grouped by Company Name and Street Address. Show me groups with more than one company ranked by descending order (count of companies descending).
The Petavue Workflow
01 / PLAN

Craft the Plan

Petavue starts by designing a clear, reviewable plan for the analysis and confirming any key assumptions before it runs.

In this case, the initial goal was to find duplicate companies in HubSpot using company name and address. When Petavue inspected the HubSpot schema, it detected that Street Address wasn’t an available field in the Companies table. Rather than silently substituting another field, it surfaced alternative grouping keys for the user:

  • Company Name + City
  • Company Name + State/Region
  • Company Name + Postal Code
  • Company Name only

Petavue then asked the user to choose which option they preferred for identifying duplicates.

The user selected Company Name + State/Region as the most meaningful combination.

With that clarification captured, Petavue generated a concrete analysis plan:

  • Pull all records from the HubSpot Companies table (hubspot_companies)
  • Use name and state/region as the grouping keys
  • Count how many companies appear in each name + state/region group
  • Filter to show only groups with more than one record (potential duplicates)
  • Sort the groups in descending order by duplicate count
  • Return both the grouped data and the counts for full visibility

Nothing is executed until the user approves this plan, so they know exactly what logic will be applied to their data.
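
The plan above maps closely onto a single grouped query. The sketch below is one way to express it, not Petavue's actual implementation: it loads a few hypothetical rows into an in-memory SQLite table named after `hubspot_companies`, and the column names `name` and `state` are assumptions standing in for the real HubSpot properties.

```python
import sqlite3

# Hypothetical sample rows (name, state); None marks a missing State/Region.
ROWS = [
    ("Acme Corp", "Colorado"),
    ("Acme Corp", "Colorado"),
    ("Acme Corp", "Colorado"),
    ("Globex Inc", "Utah"),
    ("Globex Inc", "Utah"),
    ("Initech", "Georgia"),
    ("Globex Inc", None),
    ("Globex Inc", None),
]

# Group by name + state/region, keep groups with more than one record,
# and sort by duplicate count in descending order.
DUPLICATE_QUERY = """
    SELECT name, state, COUNT(*) AS company_count
    FROM hubspot_companies
    GROUP BY name, state
    HAVING COUNT(*) > 1
    ORDER BY company_count DESC, name, state
"""


def find_duplicate_groups(rows):
    """Return (name, state, company_count) for each potential duplicate group."""
    conn = sqlite3.connect(":memory:")
    try:
        # Assumed two-column schema; the real table has many more properties.
        conn.execute("CREATE TABLE hubspot_companies (name TEXT, state TEXT)")
        conn.executemany("INSERT INTO hubspot_companies VALUES (?, ?)", rows)
        return conn.execute(DUPLICATE_QUERY).fetchall()
    finally:
        conn.close()


duplicate_groups = find_duplicate_groups(ROWS)
for name, state, company_count in duplicate_groups:
    print(name, state, company_count)
```

Note that `GROUP BY` places rows with a missing state in the same group, so records that lack State/Region can still surface as potential duplicates of one another.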

02 / VERIFY

Ensure Accurate Execution

After the user approved the plan, Petavue ran the analysis and validated the output.

The verification step surfaced key results:

  • 2,773 groups of companies shared the same name and state/region
  • Duplicate groups ranged from 2 to 6 records each
  • The highest-duplicate company names included (each appearing 6 times):
    • Hyatt LLC
    • Bayer LLC
    • Grimes and Sons
  • 55.1% of records in these duplicate groups were missing State/Region, indicating a major data completeness issue
  • States like Colorado, Utah, and Georgia showed the highest concentrations of duplicates

Petavue evaluates patterns, highlights missing or inconsistent data, and distinguishes between likely true duplicates and cases that may be legitimate multi-location companies.
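
Summary figures like the ones above (group count, size range, share of records missing State/Region) can be derived from the grouped output with a few lines of code. The sketch below is illustrative only: the `groups` list is hypothetical sample data, not the real results, and the `summarize` helper is an assumed name rather than anything Petavue exposes.

```python
from statistics import median

# Hypothetical (name, state, count) duplicate groups; None marks a missing State/Region.
groups = [
    ("Acme Corp", "Colorado", 3),
    ("Globex Inc", None, 2),
    ("Initech", "Utah", 2),
    ("Acme Corp", None, 3),
]


def summarize(groups):
    """Compute headline statistics for a non-empty list of duplicate groups."""
    sizes = [count for _, _, count in groups]
    total_records = sum(sizes)
    # Records that belong to a group whose State/Region is empty or missing.
    missing_records = sum(count for _, state, count in groups if not state)
    return {
        "group_count": len(groups),
        "min_size": min(sizes),
        "max_size": max(sizes),
        "median_size": median(sizes),
        "pct_missing_state": round(100 * missing_records / total_records, 1),
    }


summary = summarize(groups)
print(summary)
```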

03 / PRESENT

Surface the Insights

Once the analysis is verified, Petavue packages the results into an insight-rich, action-ready view.

Key Findings

  • Total duplicate groups: 2,773
  • Median group size: 2 records, with the largest groups containing 6
  • 55.1% of records in duplicate groups are missing State/Region, which makes deduplication and segmentation harder
  • Certain states show a disproportionate share of duplicate companies

Business Impact

Petavue explains what this means for day-to-day operations:

  • Sales teams may contact the same company multiple times from different records
  • Account ownership and reporting become fragmented
  • Marketing campaigns can underperform when companies and contacts are split across duplicates
  • Leadership dashboards and account-based metrics lose reliability

Recommendations

To close the loop, Petavue offers concrete next steps:

  1. Clean Up Duplicates – Review and merge the 2,773 duplicate groups, starting with those that have 5–6 records.
  2. Tighten Data Entry – Make State/Region a required field to lower the risk of incomplete records.
  3. Automate Ongoing Management – Turn on and tune HubSpot’s duplicate-detection and merge tools.
  4. Improve Process & Training – Educate CRM users to search for existing companies before creating new ones.

Supporting Table

Alongside the narrative, Petavue provides a detailed results table showing:

  • Company name
  • Company state/region
  • Duplicate count for each group

Users can export this table directly into their cleanup workflows or use it to prioritize which duplicates to resolve first.
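
As a rough illustration of that prioritization step, the sketch below orders duplicate groups largest first and writes them out as CSV text. The `export_cleanup_queue` helper, the column headers, and the sample rows are all hypothetical; they only mirror the three columns listed above.

```python
import csv
import io


def export_cleanup_queue(groups, min_size=2):
    """Return CSV text of duplicate groups, largest groups first.

    `groups` holds (company_name, state_region, duplicate_count) tuples;
    `min_size` filters to groups with at least that many records.
    """
    selected = [g for g in groups if g[2] >= min_size]
    # Sort by count descending, then by name and state for a stable order.
    ordered = sorted(selected, key=lambda g: (-g[2], g[0], g[1] or ""))
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["company_name", "state_region", "duplicate_count"])
    for name, state, count in ordered:
        writer.writerow([name, state or "", count])
    return buffer.getvalue()


# Hypothetical sample groups; None marks a missing State/Region.
sample_groups = [
    ("Initech", "Utah", 2),
    ("Acme Corp", "Colorado", 6),
    ("Globex Inc", None, 5),
]

# Start with the largest groups (5 or more records), as the first recommendation suggests.
priority_csv = export_cleanup_queue(sample_groups, min_size=5)
print(priority_csv)
```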