The Challenge of Massive Binaries in Large-Scale Codebases
By todsacerdoti
Summary
The article discusses massive binaries (ELF files) in large-scale codebases, drawing on the author's experience during their PhD and in industry, including at Google. The author observed binaries exceeding 25GiB, including debug symbols, a problem that arises only in very large codebases. Reviewers of the author's publication submissions often claimed such problems did not exist, and the author could not cite what they had seen in industry. The piece explores how these huge binaries come to be and the gap between industry-scale problems and academic recognition.
Key quotes
One problem that is only present at these mega-codebases is massive binaries.
I had observed binaries beyond 25GiB, including debug symbols.
Responses to my publication submissions often claimed such problems did not exist; however, I had observed them during my time within industry, such as at Google, but I couldn't cite it!
What's the largest binary (ELF file) you've ever seen?
