

Data Minimization
Seek efficiencies in data + respect for privacy.
Purpose
- A data minimization strategy limits data collection, processing, and retention to only what’s necessary for specific business purposes.
- Data minimization requires strategic assessment of data types and purposes through the customer lifecycle.
- Responsible data practices reduce security risks, compliance burdens, storage costs, and potential legal liabilities. They also improve operational efficiency and build customer trust by clearly respecting privacy.
Method
- Resist groupthink
First, let go of the standard practice of collecting as much customer data as possible in the hope that it can be monetized at some later date.
Data maximization is a lazy approach that runs on FOMO and sacrifices thoughtful decision-making on the altar of groupthink.
Your goal is to decrease (a) risks from data breaches, and (b) costs from massive data storage and maintaining interoperability as your company (and databases) grow.
- Be intentional with what you choose to collect.
Implement purpose-driven collection, where every data point has a specific, documented business justification.
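One way to make purpose-driven collection enforceable is to keep a registry of fields and their documented purposes, and drop anything that is not in it. The following is a minimal sketch, not a definitive implementation; `FIELD_PURPOSES`, `minimize`, and the field names are hypothetical and chosen only for illustration.

```python
# Hypothetical registry: every field the company collects, mapped to its
# documented business justification. A field without an entry is not collected.
FIELD_PURPOSES = {
    "email": "account login and order receipts",
    "postal_code": "shipping cost estimation",
}


def minimize(payload: dict) -> dict:
    """Keep only the fields that have a documented business purpose."""
    return {key: value for key, value in payload.items() if key in FIELD_PURPOSES}


# Illustrative sign-up payload; "birthdate" has no documented purpose.
signup = {
    "email": "a@example.com",
    "postal_code": "94107",
    "birthdate": "1990-01-01",
}
stored = minimize(signup)  # "birthdate" is dropped before anything is stored
```

With a registry like this, adding a new field requires someone to write down why it is needed, which keeps the justification and the collection logic in the same place.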
Create templates requiring you and the rest of the team to articulate what business question the data answers, how it contributes to revenue or cost reduction, and whether less sensitive data could fulfill the same purpose.
- Retain what’s useful
Establish time-bound retention policies with automated deletion schedules based on data utility.
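An automated deletion schedule can be as simple as a retention period per record type and a periodic job that filters out anything past its limit. The sketch below assumes hypothetical record types and day counts (`RETENTION_DAYS`) and hypothetical helpers (`is_expired`, `purge`); real limits would come from your own policy.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention periods per record type, in days.
RETENTION_DAYS = {
    "web_session": 90,
    "support_ticket": 365,
}


def is_expired(record_type: str, created_at: datetime, now: datetime) -> bool:
    """Return True when a record has outlived its retention period.

    An unknown record type raises KeyError, so data without a documented
    retention policy cannot pass through silently.
    """
    limit = timedelta(days=RETENTION_DAYS[record_type])
    return now - created_at > limit


def purge(records: list[dict], now: datetime) -> list[dict]:
    """Return only the records that are still within their retention window."""
    return [r for r in records if not is_expired(r["type"], r["created_at"], now)]


# Illustrative run with a fixed "now" so the result is reproducible.
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
records = [
    {"type": "web_session", "created_at": datetime(2024, 12, 1, tzinfo=timezone.utc)},
    {"type": "web_session", "created_at": datetime(2024, 6, 1, tzinfo=timezone.utc)},
    {"type": "support_ticket", "created_at": datetime(2024, 6, 1, tzinfo=timezone.utc)},
]
kept = purge(records, now)  # the June web session is past its 90-day limit
```

In practice the same check would run as a scheduled job against the database rather than an in-memory list.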
Implement a tiered system categorizing information. For example, classify data types as hot data (0-90 days for immediate business use), warm data (90-365 days for occasional reference), and cold data (1+ years with specific justification for regulatory or rare needs).
- Defensive (data) driving
When determining what data is absolutely required, assess whether processing can occur locally rather than centrally. Leverage edge computing and local device processing where possible, and use differential privacy techniques to extract insights without collecting raw data.
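To illustrate the differential privacy idea, here is a minimal sketch of the Laplace mechanism applied to a simple count. The function name `noisy_count` and the choice of `epsilon` are assumptions for illustration; a production system should rely on a vetted differential privacy library rather than hand-rolled noise.

```python
import random


def noisy_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Release a count with Laplace noise (counting query, sensitivity 1).

    Smaller epsilon means more noise and stronger privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    # The difference of two independent exponential draws with rate epsilon
    # follows a Laplace distribution with scale 1 / epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise


# Illustrative use: report roughly how many users enabled a feature
# without publishing the exact figure.
rng = random.Random(0)
reported = noisy_count(100, epsilon=1.0, rng=rng)
```

Only the noisy aggregate leaves the system where the raw records live, so analysts can still see trends without access to individual-level data.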
For analytics purposes, default to anonymous aggregation by setting reporting at the aggregate level and requiring specific justification for individual-level data access.
- If this, then that
Assume your company will, at some point, experience a data breach. What will that cost your company, both financially and in brand identity? The global average cost of a data breach was $4.88 million in 2024. By retaining only essential records, a company can minimize the impact of a breach if it occurs.
- Mind the gap
If you want to enter the European market at some point, note that data minimization is a core GDPR principle: personal data must be “adequate, relevant and limited to what is necessary in relation to the purposes for which they are processed.” Many other global markets are adopting GDPR-type data approaches as well.
- Proactive beats reactive
Don’t wait to put data minimization in place. For a small company, a breach poses an existential risk.
Large companies have survived massive data breaches (Equifax 2017, Target 2013) and their stock prices rebounded within a year.
- Startups have limited resources, and unlike large public companies they usually lack adequate insurance coverage, financial reserves to absorb breach costs, robust backup systems and disaster recovery plans, and the legal resources to handle litigation and regulatory compliance.
- An often-cited statistic holds that 60% of small companies go out of business within six months of falling victim to a data breach or cyber attack. Some small companies have been destroyed within hours (Code Spaces, within 12 hours) or days (MyBizHomepage, within 8 days), whereas large companies often recover within a year.
Trap Doors
- Maintain data for a reason…
Do maintain data that will be essential for analytics or customer insights, but avoid vague “just in case” retention policies, which can lead to unnecessary data hoarding.
- Iterate…
Embrace iteration as you work to determine the right middle ground.
- Don’t assume every…
Don’t assume every part of the business will have the same data retention needs.
- Don’t forget to…
Don’t forget to regularly review data storage policies so that they reflect internal and external constraints.
- Don’t assume initial…
Don’t assume initial data minimization efforts will last forever; as the team expands, new collection practices will emerge in pockets of the company.
Most people have been trained to see all data as valuable.
Be ruthlessly analytical in determining which data has long-term value.


Cases
In 2019, a hacker stole Capital One credit card applications dating back to 2005, affecting approximately 100 million people in the United States and about 6 million in Canada.
Capital One’s reputation took a hit, and the company had to invest substantially in improved cloud security measures. One might ask why credit card applications that were over a decade old needed to be stored as perpetual data.
Read more about this data breach in financial services.
Sources: McLean, Scott. “A hacker gained access to 100 million Capital One credit card applications and accounts.” CNN, 2019. Available here.
Consider SolarWinds, one of the most infamous security incidents (until the CrowdStrike incident of 2024), detailed in this report.
Sources: Bueno, Felipe. “Solarwinds Attack.” Harvard Kennedy School Belfer Center, 2021. Available at.
Who to Enlist
Front-end and back-end engineering both interact with data collection; get both functions to the table.
Ask the Board and investors with information rights what regulatory or legal guardrails you should keep in mind for records retention.
The person in charge of infrastructure will have insight into costs for scaling current data collection practices.
If the company is later stage and has outside/in-house general counsel, ask them to ensure that your policies align with your commercial objectives.