Communications of the ACM
Why Companies Keep Hoarding Data They Can’t Protect
I need identity architecture experts, CISOs, privacy engineers, breach response specialists, and other relevant sources. Story blurb and questions: In January 2026, researchers and news media confirmed a data breach affecting 72.7 million Under Armour customer records. The records included full names, email addresses, dates of birth, genders, purchase histories, and geographic locations. The Everest ransomware group claimed responsibility and published samples of customer and employee data on the dark web. The question isn’t how sophisticated the attack was. It’s why a retail company was storing so much personal information.
Why are businesses collecting customer and employee data they can’t secure?
Enterprises stockpile personal data to sharpen insights, drive more revenue, and gain a competitive edge. But the strategy has become a liability: the volume itself creates the vulnerability. The more data companies store, the bigger the target they become.
Businesses must rethink how they build identity systems, verify users, and measure success. Instead of collecting everything and protecting it later, some companies are collecting only what they need, using verification methods that don’t require long-term storage, and designing systems that limit how much any single breach can expose.
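To ground what "verification without long-term storage" can look like, here is a minimal sketch in Python. It is an illustration only, with hypothetical names such as start_verification and confirm_verification: the service sends a short-lived one-time code, records only a keyed hash of the email address once the code is confirmed, and never writes the plaintext address or any other attribute to long-term storage.

```python
import hashlib
import hmac
import os
import secrets
import time

# Hypothetical sketch: verify that a customer controls an email address once,
# then keep only a keyed hash of it -- never the raw address or other PII.

SERVER_SALT = os.urandom(32)          # per-deployment secret, rotated periodically
PENDING_TTL_SECONDS = 15 * 60         # one-time codes expire after 15 minutes

_pending = {}      # code -> (hashed_email, expiry); discarded after use
_verified = set()  # keyed hashes of verified customers; no plaintext retained


def _hash_identifier(email: str) -> str:
    """Reduce the raw identifier to a keyed hash so the original value
    never needs to be stored after the verification handshake."""
    return hmac.new(SERVER_SALT, email.strip().lower().encode(), hashlib.sha256).hexdigest()


def start_verification(email: str) -> str:
    """Issue a short-lived one-time code; the raw email is used only to
    deliver the message and is not written to long-term storage."""
    code = secrets.token_urlsafe(8)
    _pending[code] = (_hash_identifier(email), time.time() + PENDING_TTL_SECONDS)
    # send_email(email, code)  # hypothetical transport step; address not persisted
    return code


def confirm_verification(code: str) -> bool:
    """Consume the one-time code; on success record only the keyed hash."""
    entry = _pending.pop(code, None)
    if entry is None:
        return False
    hashed_email, expiry = entry
    if time.time() > expiry:
        return False
    _verified.add(hashed_email)
    return True


def is_verified(email: str) -> bool:
    """Later checks re-derive the hash from the value the user presents,
    so the service never keeps the plaintext on file."""
    return _hash_identifier(email) in _verified
```

The point of the design is that a breach of this store yields salted hashes rather than names, birthdates, and locations; the same pattern extends to attributes such as age, where a one-time check can be reduced to a stored yes/no flag.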
What technology lets businesses verify identity and deliver services without storing so much personal data? How do short-term verification and distributed authentication work in practice?
Who’s doing this at enterprise scale? What security improvements and cost-benefit numbers are they seeing?
When should companies delete data they’ve collected, and how do they balance minimization against eDiscovery requirements, legal holds, and business intelligence needs?
Why have businesses resisted data minimization despite breach costs averaging $10.22 million in the U.S., according to the IBM 2025 Cost of a Data Breach Report?
What organizational, technical, and financial barriers stand in the way?
How does data minimization coexist with privacy rules, AI training requirements, and biometric authentication, all of which seem to demand more data?
Where are the gaps in identity systems that make retaining unnecessary data all but inevitable, and what tools or frameworks could make storing less data the default?