One of my pet peeves with the use of complex(ity) outside the traditional time/space sense in computer science is that the authors of many articles around the internet do not distinguish between bounded/arbitrary complexity, where the person has most of the control over what is being implemented, and domain/accidental/environmental complexity, which is wide open and carries a lot of intrinsic, often unsolvable constraints.
Yes, they are Google; yes, they have a great pool of talent around; yes, they do a lot of hard stuff; but most of the time when I read those articles, I miss those kinds of distinctions.
Not lowballing the folks at Google, they do amazing stuff, but some domains of domain/accidental/environmental complexity (e.g. sea logistics, manufacturing, industry, etc.), where most of the time you do not even have the data, are, I believe, far more complex/harder than most of the problems Google deals with.
The phrase thrown around was “collaboration headwind”: the idea was that if project success depends on one person with a 95% chance of success, the project also has a 95% chance of success. But if 10 people each need to succeed, each with a 95% chance, the project's likelihood of success suddenly drops to about 60% (0.95^10 ≈ 0.60)…
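That headwind figure is just independent probabilities multiplied together; a minimal sketch (the 95% per-person number is the hypothetical from the comment, not real data):

```python
# Chance a project succeeds when it depends on n people, each of whom
# must independently succeed with probability p.
def project_success(p: float, n: int) -> float:
    return p ** n

print(project_success(0.95, 1))   # one person: 0.95
print(project_success(0.95, 10))  # ten people: ~0.599, i.e. ~60%
```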
In reality, lazy domain owners layered on processes, meetings, documents, and multiple approvals until it took six months to change the text on a button. Ugh.
Another side of this coin is that the expected payoff from a project depends on how many unrelated projects your organization is engaging in, which is deeply counterintuitive to most people.
Every project carries with it three possibilities: that of success, where the company makes money, that of failure, where the company does not, and that of a "critical failure", where the project goes so wrong that it results in a major lawsuit, regulatory fine or PR disaster that costs the company more than the project was ever expected to make.
If you're a startup, the worst that can happen to your company is the value going to 0. From an investor's perspective, there's not much of a difference between burning all the money ($10m) and not finding product-market-fit (normal failure), or your company getting sued for $3b and going bankrupt (critical failure). The result is the same, the investment is lost. For a large corporation, a $3b lawsuit is far more costly than sinking $10m into a failed project.
You can trade off these three possibilities against each other. Maybe forcing each release through an arduous checklist of legal review or internationalization and accessibility testing decreases success rates by 10%, but moves the "critical failure rate" from 1% to 0.5%. From a startup's perspective, this is a bad tradeoff, but if you're a barely-profitable R&D project at big co, the checklist is the right call to make.
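To see why the checklist can be the right call at a big company, here is a toy expected-value comparison. Every number below (payoff, costs, rates) is made up purely to illustrate the tradeoff described above:

```python
# Toy model: a project either succeeds (earn payoff), fails (lose sunk cost),
# or fails critically (lose critical_cost). All figures hypothetical, in $m.
def expected_value(p_success, p_critical, payoff, sunk_cost, critical_cost):
    p_failure = 1.0 - p_success - p_critical
    return p_success * payoff - p_failure * sunk_cost - p_critical * critical_cost

# Without the checklist: 60% success, 1% critical-failure rate.
no_checklist = expected_value(0.60, 0.010, payoff=50, sunk_cost=10, critical_cost=3000)
# With the checklist: success rate 10% lower, critical failures halved.
with_checklist = expected_value(0.54, 0.005, payoff=50, sunk_cost=10, critical_cost=3000)

print(no_checklist, with_checklist)  # ~-3.9 vs ~7.45
```

With these made-up numbers, the checklist flips the project from negative to positive expected value even though it lowers the success rate.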
This problem is independent of all the other causes to which bureaucracy is usually attributed, like the number of layers of management, internal culture, or "organizational scar tissue." Just from a legal and brand-safety perspective, the bigger your org, the more bureaucracy makes sense, no matter how efficient you can get your org to be.
Interesting. As a consultant for most of the last 25 years, my experience is that domain owners are typically invested and have strong opinions on the things that impact their jobs.
Executive leadership, on the other hand, doesn't actually want to know the issues; their eyes glaze over as they look at their watches because they have a tee time.
There's a culture of "I won't approve this unless it does something for me" at Google. So now changing the text on a button comes with 2 minor refactors, 10 obvious-but-ignored bugfixes, and 5 experiments to prove that it is actually better.
Well, when the owner asks for a whole test suite that didn't exist just to get a fix in, what most likely happens is that you've wasted your time on a draft CL that will get lost.
They aren't asking for you to write tests because 'it benefits them', they are asking you to write tests because as a professional engineer, you should write tests, and not just yolo it.
Look, sometimes you may have good reasons why a test is impractical. You are allowed to push back, or to look for a different reviewer. There are a hundred thousand people in the firm; you should be able to find one or two who will let you submit literally anything that compiles.
But most of the time, the reviewer is giving you advice that you should take.
If you are turning a button to a slightly different shade of blue and it's not a button you own, the owner of the button should not be asking you to write tests for the button.
Well, good management/tech leadership is about making sure that the risks coming from individual failure points (10 people in your example) are recognized and mitigated, and that the individuals involved can flag risks and conflicts early enough so that the overall project success probability does not go down as you describe...
The assumptions in that math are wrong anyway. Once you depend on 10 people, the chance that they each achieve "95% successful execution" is 0.
This is only partially down to the impossibility of having every staff member on a project be an A++ player.
There is coordination RISK, not just coordination overhead. Think of planning a 2-week trip with your spouse with multiple planes/trains/hotels, museum/exhibit ticket bookings, meal reservations, etc. Inevitably something gets misunderstood/miscommunicated between the two of you and therefore mis-implemented.
Now add more communication nodes to the graph and watch the error rate explode.
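That "nodes in the graph" point is the old pairwise-channels arithmetic: links grow roughly with the square of the headcount, so even a small per-link miscommunication rate compounds quickly. A toy sketch (the 2% per-channel error rate is an arbitrary assumption):

```python
def channels(n: int) -> int:
    # Number of pairwise communication links among n people.
    return n * (n - 1) // 2

def p_all_clear(n: int, p_error: float = 0.02) -> float:
    # Chance that no channel miscommunicates, assuming independent channels.
    return (1.0 - p_error) ** channels(n)

for n in (2, 5, 10):
    print(n, channels(n), round(p_all_clear(n), 2))
```

Two people share a single channel and almost never miscommunicate; ten people share 45 channels, and the chance that nothing gets garbled drops to roughly 40%.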
That's what the math is reflecting. The project succeeds only if all 10 people do their jobs well. Each person has a 95% chance of succeeding, and 0.95^10 ≈ 60%, so the chance that all 10 people do their jobs successfully is ~60%.
Those jobs also include things like management and product design, and so the coordination risk is reflected in the 5% chance that the manager drops the ball on communication. (As a manager, I suspect that chance is significantly more than 5% and that's why overall success rates are even lower.)
And when you’re at a smaller company, 90% of your time is fighting societal complexity, whose limit also approaches infinity, but at a steeper angle.
No true Scotsman can tell you this: reality is surprisingly complex. Sometimes you have the resources to organize and fight it; sometimes you use those resources more wisely than another group of people and can share the lessons. And sometimes you just have no idea if your lesson is even useful. Let’s judge the story on its merits and learn what we can from it.
Look, I've never had to design, build, or maintain systems at the scale of a FAANG, but that doesn't mean I haven't been involved in pretty complicated systems (e.g., 5000 different pricing and subsidy rules for 5000 different corporate clients with individually negotiated hardware subsidies (changing all the time) and service plans, commission structures, plus logistics, which involves not only shipping but shipping to specific departments for configuration before the device goes to the employee, etc.).
To pick an arbitrary number, 95% of the time the issues were people problems, not technical ones.
You are right, but that misses the flavor of the problem. I was a consultant in infosec to F500s for many years. Often, solving a problem involves simply knowing the right person who has already thought about the problem, or toiled on it or a similar one. But when there are 100,000 engineers, it becomes an order of magnitude (or two!) more difficult, and that puts forth unique challenges. You can still call them “people problems”, and they often may be. However, if you try to solve them the same way you might at a smaller engineering org, you will get nowhere and be poorer for the time you spent trying. Ask me how I know, lol.
The technical problems are also like that. Almost everything has an analog to something you are probably familiar with, but it is scaled out, has a lot of unfamiliar edges, and is often just different enough that you have to adjust your reasoning model. Things you can just do at even a typical F500 you can't just do at big-tech scale.
Anyway, you are directionally correct, and many of these wounds are self-inflicted. But running a company like Google or Facebook is ridiculously hard and there are no easy answers; we just do our best.
I have a similar perspective. I think after a few years, it's the people things that have always been the hardest part of the job. That's probably why in the interviews, we always say things like: communication is key, culture fit, etc.
On the other hand, the good part of the job is solving complex technical problems with a team.
I think this is addressed by the complex vs. complicated intro. Most problems with uncontrolled/uncontrollable variables will be approached with an incremental solution: you'll restrict those variables, voluntarily or involuntarily, and let issues be solved organically/manually, or automation will plainly and simply be abandoned.
This qualifies as complicated. Delving into complicated problems is mostly driven by business opportunity, always has limited scaling, and tends to be discarded by big players.
I don't think it is, because the intro gets it wrong. If a problem's time or space complexity increases from O(n^2) to O(n^3) there's nothing necessarily novel about that, it's just... more.
Complicated on the other hand, involves the addition of one or more complicating factors beyond just "the problem is big". It's a qualitative thing, like maybe nobody has built adequate tools for the problem domain, or maybe you don't even know if the solution is possible until you've already invested quite a lot towards that solution. Or maybe you have to simultaneously put on this song and dance regarding story points and show continual progress even though you have not yet found a continuous path from where you are to your goal.
Climate change is both, doing your taxes is (typically) merely complex. As for complicated-but-not-complex, that's like realizing that you don't have your wallet after you've already ordered your food: qualitatively messy, quantitatively simple.
To put it differently, complicated is about the number of different domains you have to consider; complex is about how difficult the considerations are within a given domain.
Perhaps the author's usage is common enough in certain audiences, but it's not consistent with how we discuss computational complexity. Which is a shame since they are talking about solving problems with computers.
I don't think this is adequately addressed by the "complicated vs. complex" framing—especially not when the distinction is made using reductive examples like taxes (structured, bureaucratic, highly formalized) versus climate change (broad, urgent, signaling-heavy).
That doesn’t feel right.
Let me bring a non-trivial, concrete example—something mundane: “ePOD,” which refers to Electronic Proof of Delivery.
ePOD, in terms of technical implementation, can be complex to design for all logistics companies out there like Flexport, Amazon, DHL, UPS, and so on.
The implementation itself—e.g., the box with a signature open-drawing field and a "confirm" button—can be as complicated as they want from a pure technical perspective.
Now comes, for me at least, the complex part: in some logistics companies, the ePOD adoption rate is circa 46%. In other words, in 54% of all deliveries, you do not have a real-time (not before 36–48 hours) way to know and track whether the person received the goods or not. Unsurprisingly, most of those are still done on paper. And we have:
- Truck drivers are often independent contractors.
- Rural or low-tech regions lack infrastructure.
- Incentive structures don’t align.
- Digitization workflows involve physical paper handoffs, WhatsApp messages, or third-party scans.
So the real complexity isn't only the "technical implementation of ePOD" but: "given the ePOD, how do we maximize its adoption/coverage with a lot of uncertainty, fragmentation, and human unpredictability on the ground?"
That’s not just complicated, it’s complex 'cause we have:
- Socio-technical constraints,
- Behavioral incentives,
- Operational logistics,
- Fragmented accountability,
- And incomplete or delayed data.
We went off the highly controlled scenario (an arbitrarily bounded technical implementation) that could be considered complicated (if we want to be reductionist, as the OP has done), and now we’re navigating uncertainty and any number of issues that can go wrong.
I've not seen "accidental" complexity used to mean "domain" (or "environmental" or "inherent") complexity before. It usually means "the complexity you created for yourself and isn't fundamental to the problem you're solving"
Also, anything you do with enterprise (cloud) customers. People like to talk about scale a lot and data people tend to think about individual (distributed) systems that can go webscale. A single system with many users is still a single system. In enterprise you have two additional types of scale:
1) scale of application variety (10k different apps with different needs and history)
2) scale of human capability (ingenuity), this scale starts from sub-zero and can go pretty high (but not guaranteed)
I'm a HW engineer and don't really understand "complexity" as far as this article describes it. I didn't read it in depth, but it doesn't really give any good examples with specifics. Can someone give a detailed example of what the author is really talking about?
The idea is if someone helps you in a really big way that you’re able to reward that. So you can ask the company to give the person either credits for an internal store, or a direct addition to their salary for one month.
Obviously, there are limits to how many pay bonuses you can give out and if it’s direct money or store credits.
Directly asking for a peer bonus is not very “Googley” (and yes, this is a term they use, in case you needed evidence of Google being a bit cultish).
My last workplace had a similar institution, only the reward was a candy bar or similar that you could grab from a bowl in the kitchen (working on an honor-code basis), in addition to getting some praise on Slack for general warm fuzzies. It was more of a symbolic gesture for recognizing small everyday things, of course, but it was nice IMO.
Probably referring to the fact that they only rewarded them with a candy bar for being a good employee. Which ignores the fact that they're already probably getting paid a decent salary to do their job, and being a good employee is already part of the job description to receive said salary. Anything extra is nice.
Yeah. The chocolate was of course a triviality, more important was the idea of encouraging people to give public thanks and the associated (extremely immaterial) karma points when thanks are due. In this culture (Finnish) we're perhaps not very good at giving praise, and even worse at receiving it, so it helps to have an established ritual for doing so. Also, I think at least one of the original goals was to mitigate the silo effect and encourage people to help their coworkers in other projects and such.
> The idea is if someone helps you in a really big way that you’re able to reward that
It never ceases to amaze me how (early) big tech embraced and even promoted things that would have been considered "career limiting" in traditional big corporations.
By systematising/gamifying this stuff you actually help distract people from participating in the realpolitik going on within the executive team. If these distractions stop non-exec levels from realising how power is really exercised within the company, that removes a potentially very large pool of competitors for power within the org.
Don't know about your flavor of 'traditional big corporations', but my banking megacorp has had an internal reward system across various 'virtues' for a decade-plus at least. It's not a direct reward -> money link (that's rather for hiring success); it just helps you build a sort of karma, and when bonuses, raises, and promotions are considered, this is taken into account.
Since that process is invisible to those being measured, you never know the details (and shouldn't, as long as management is sane; if it isn't, this is the least of your concerns), but it's not ignored, and in this way it helps keep people motivated to generally do good work.
Big bank. Management theory at the time was to create competition between the silos for resources, time, budget, headcount, good desk locations in the bi-annual room desk shuffle, bonuses and even time of day from management. Even sales and trading - the most symbiotic of functions competed.
I was in kindergarten watching my fellow classmates get gold star stickers on their work. They were excited when it happened to them. I saw it as being given nothing of real value, since a person could just go to the store and buy the stickers for $1 or $2.
It is a social engineering technique to exploit more work without increasing wages. Just like "Employee of the Month" or a "Pizza Party."
The company I work for does this with gift cards as rewards. I was reprimanded because I sent an email to HR saying that this "gift" is as useful as a wet rag in the rain. I don't eat at restaurants that are franchises or have a ticker on Wall Street. I prefer local brick-and-mortar over Walmart and will never financially support Amazon.
If you want to truly honor my accomplishments, give me a raise or more PTO. Anything else is futile. That gift card to Walmart has 0 value towards a quality purchase like a RADAR or LiDAR development kit to learn more or such.
At a previous company I worked at, peer bonuses literally resulted in a small bonus at the end of the pay period. No gift card, just an email notification and money credited to your account. Most motivating form of peer appreciation I've seen.
Basically a way to "tip" people for going out of their way to help you, except that the "tip" comes out of the company's pocket, not yours.
To prevent obvious abuse, you need to provide a rationale, the receiver's manager must approve and there's a limit to how many you can dish out per quarter.
Bonuses make a lot of sense in the financial sector, because the whole endeavor is about making money. Intrinsic motivation and making more money align. Historically it got introduced in order to mitigate cheating customers for personal gain. Also it helps that individual contributions are trivially quantifiable to a very large degree.
Obviously there are other professions that share some of these characteristics, like sales. Or if you narrow down a goal or task to "save us money".
>“the difficult we do immediately. The impossible takes a little longer”
This was posted in my front office when I started my company over 30 years ago.
It was a no-brainer, same thing I was doing for my employer beforehand. Experimentation.
By the author's distinction in terminology, the complexity (relative to the complications) in something like Google's technology is on a different scale from the near-absolute chaos (relative to the mere remaining complexity) you find when you apply the same distinction to natural science.
I learned how to do what I do directly from people who did it in World War II.
And that was when I was over 40 years younger, plus I'm not done yet. Still carrying the baton in the industrial environment where the institutions have a pseudo-military style hierarchy and bureaucracy. Which I'm very comfortable working around ;)
Well, the army is a massive mainstream corp.
There are always some things that corps don't handle very well, but generals don't always care: if they have overwhelming force to apply, lots of different kinds of objectives can be overcome.
Teamwork, planning, military-style discipline & chain-of-command/org-chart, strength in numbers, all elements which are hallmarks of effective armies over the centuries.
The engineers are an elite team among them, traditionally something like the technology arm, engaged to leverage the massive resources even more effectively.
The bigger the objective, the stronger these elements will be brought to bear.
Even in an unopposed maneuver, while steam-rolling all easily recognized obstacles more and more effectively as they up the ante, bigger and bigger unscoped problems accumulate at the same time, exactly the kind that cannot be solved with teamwork and planning (since these are often completely forbidden). Then there must be extreme individual ability far beyond that, and it must emanate from the top decision-maker or have "equivalent" access to the top individual decision-maker. IOW, it might as well not even be "in" the org chart, since it's just a few individuals directly attached to the top square; nobody's working for further promotions or recognition beyond that point.
Sometimes military discipline in practice is simply not enough discipline, and not exactly the kind that's needed, not by a long shot.
That's why even in the military there are a few Navy Seals here and there, because sometimes there are serious problems that are the kind of impossible that a whole army cannot solve ;)
> My immediate reaction in my head was: "This is impossible". But then, a teammate said: "But we're Google, we should be able to manage it!".
"We can do it!" confidence can be mostly great. (Though you might have to allow for the possibility of failure.)
What I don't have a perfect rule for is how to avoid that twisting into arrogance and exceptionalism.
Like, "My theory is correct, so I can falsify this experiment."
Or "I have so much career potential, it's to everyone's advantage for me to cheat to advance."
Or "Of course we'll do the right thing with grabbing this unchecked power, since we're morally superior."
Or "We're better than those other people, and they should be exterminated."
Maybe part of the solution is to respect the power of will, effort, perseverance, processes, etc., but to be concerned when people don't also respect the power and truth of humility, and start thinking of individual/group selves as innately superior?
I think there are two myths applicable here. Probably more.
One myth is that complex systems are inherently bad. Armed forces are incredibly complex. That's why it can take 10 or more rear echelon staff to support one fighting soldier. Supply chain logistics and materiel is complex. Middle ages wars stopped when gunpowder supplies ran out.
Another myth is that simple systems are always better and remain simple. They can be, yes. After all, DNA exists. But some beautiful things demand complexity built up from simple things. We still don't entirely understand how DNA and environment combine. Much is hidden in this simple system.
I do believe one programming language might be a rational simplification, if you exclude all the DSLs which people implement to tune it.
> Middle ages wars stopped when gunpowder supplies ran out.
The arquebus is the first mass gunpowder weapon, and doesn't see large scale use until around the 1480s at the very, very tail end of the Middle Ages (the exact end date people use varies based on topic and region, but 1500 is a good, round date for the end).
In Medieval armies, your limiting factor is generally that food is being provided by ransacking the local area for food and that a decent portion of your army is made up of farmers who need to be back home in the harvest season. A highly competent army might be able to procure food without acting as a plague on all the local farmlands, but most Medieval states lacked sufficient state capacity to manage that (in Europe, essentially only the Byzantines could do that).
Following the definition from the article, armed forces seems like a complicated system, not a complex one. There is a structured, repeatable solution for armed forces. It does not exhibit the hallmark characteristics of complex systems listed in the article like emergent behaviors.
Agreed. The problem is not complexity. Every system must process a certain amount of information, and the system's complexity must be able to match that amount. The fundamental problem is designing systems that can manage complexity, especially runaway complexity.
> Middle ages wars stopped when gunpowder supplies ran out
Ukraine would be conquered by russia rather quickly if russians weren't so hilariously incompetent in these complex tasks, and war logistics being the king of them. Remember that 64km queue of heavy machinery [1] just sitting still? This was 2022, and we talk about fuel and food, the basics of logistics support.
Although, as I understood it, the key part of a system being complex (as opposed to complicated) is having a large number of types of interaction. So a system with a large number of parts is not enough; those parts have to interact in a number of different ways for the system to exhibit emergent effects.
Something like that. I remember reading a lot of books about this kind of thing a while ago :)
Except computers attempt to model mathematics in an ideal world.
Unless your problem comes from side effects on a computer that can’t be modeled mathematically, there is nothing technically stopping you from modeling the problem as a mathematical problem and then solving that problem via mathematics.
Like the output of the LLM can’t be modeled. We literally do not understand it. Are the problems faced by the SRE exactly the same? You give a system an input of B and you can’t predict the output of A mathematically? It doesn’t even have to be a single equation. A simulation can do it.
I think the vast majority of SRE problems are in the “side effects” category. But higher level than the hardware-level side effects of the computer that you might be imagining.
The core problem is building a high enough fidelity model to simulate enough of the real world to make the simulation actually useful. As soon as you have some system feedback loops, the complexity of building a useful model skyrockets.
Even in “pure” functions, the supporting infrastructure can be hard to simulate and critical in affecting the outputs.
Even doing something simple like adding two numbers requires an unimaginable amount of hidden complexity under the hood. It is almost impossible for these things to not have second-order effects and emergent behaviour under enough scale.
"This is one possible characteristic of complex systems: they behave in ways that can hardly be predicted just by looking at their parts, making them harder to debug and manage."
To be honest, this doesn't sound too different from many smaller and medium-sized internet projects I've worked on, because of the asynchronous nature of the web: promises, timing issues, and race conditions lead to weirdness that's pretty hard to debug, because you have to "play back" the cascading randomness of request timing, responses, encoding, browser/server shenanigans, etc.
Just because your project might not be at Google's scale doesn't mean it is therefore not complex.
Example: I'd say plenty of games fit the author's definition of "complex systems". Even the well-engineered ones (and even some which could fit on a floppy disc)
Google has a really hard time grokking the games industry, to the point they can hire people from it and just almost totally ignore them. Their ideas on how Android game development should be done were utterly hilarious, and it's only because of a couple of their dev relations people going to ludicrous lengths that it is actually viable at all.
Fundamentally, and ironically, Google likes to offload complexity on to everyone else in their ecosystems, and they got so used to people being willing to jump through hoops to do this for search ads/SEO they are very confused when faced with a more competitive environment.
One reason Google can't make games is they can't conceive of a simple enough platform on which to design and develop one. It would be a far too adventurous constantly moving target of wildly different specifications, and they would insist you support all possible permutations of everything from the start. There are reasons people like targeting games consoles, as it lets you focus on the important bits first.
>B. Google lacking the same persistence of Amazon (Consider all the products that are killed)
Yeah, like Stadia, Google's streaming gaming console thing. They even had a first-party game development division for it. So exactly what the OP was wondering about.
IMO even a more interesting observation is that even Google itself doesn't necessarily work on large scale, e.g. many regionalised services in Google Cloud don't have _that_ many requests in each region, allowing for a much simpler architecture compared to behemoths like GMail or Maps
IMO what we term "complex" tends to be whatever the current setup/system struggles to deal with or manage. Relatively speaking, Google has much, much higher complexity, but it doesn't matter as much, because even in simpler cases we are dealing with a huge amount of variety and possible states, and the principles of managing that remain the same regardless of scale.
For small scale one can build a simple system, but I see many trying to copy FAANG architecture anyway. IMHO it’s a fallacy: people think that if they copy the architecture used by Google, their company will be successful like Google. I think it’s the other way around: Google has to build complex systems because it has many users.
It’s an infectious disease among developers. Some people would spend weeks making a simple landing page, and it would require at least 3 different cloud services.
I’d wager 90% of the time spent at Google is fighting incidental organizational complexity, which is virtually unlimited.
The phrase thrown around was “collaboration headwind”, the idea was if project success depends on 1 person with a 95% chance of success, project success also had a 95% chance. But if 10 people each need to succeed at a 95% chance, suddenly the project success likelihood becomes 60%…
In reality, lazy domain owners layered on processes, meetings, documents, and multiple approvals until it took 6 months to change the text on a button, ugh
Another side of this coin is that the expected payoff from a project depends on how many unrelated projects your organization is engaging in, which is deeply counterintuitive to most people.
Every project carries with it three possibilities: that of success, where the company makes money, that of failure, where the company does not, and that of a "critical failure", where the project goes so wrong that it results in a major lawsuit, regulatory fine or PR disaster that costs the company more than the project was ever expected to make.
If you're a startup, the worst that can happen to your company is the value going to 0. From an investor's perspective, there's not much of a difference between burning all the money ($10m) and not finding product-market-fit (normal failure), or your company getting sued for $3b and going bankrupt (critical failure). The result is the same, the investment is lost. For a large corporation, a $3b lawsuit is far more costly than sinking $10m into a failed project.
You can trade off these three possibilities against each other. Maybe forcing each release through an arduous checklist of legal review or internationalization and accessibility testing decreases success rates by 10%, but moves the "critical failure rate" from 1% to 0.5%. From a startup's perspective, this is a bad tradeoff, but if you're a barely-profitable R&D project at big co, the checklist is the right call to make.
This problem is independent from all the other causes to which bureaucracy is usually attributed, like the number of layers of management, internal culture, or "organizational scar tissue." Just from a legal and brand safety perspective, the bigger your org, the more bureaucracy makes sense, no matter how efficient you can get your org to be.
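The checklist tradeoff above can be sketched as a quick expected-value calculation. All numbers here are hypothetical, chosen only to illustrate the asymmetry between a startup (whose downside is capped at losing the investment) and a large company (which actually eats the lawsuit):

```python
# Hypothetical numbers: a project returns $50m on success and burns its
# $10m budget on a normal failure; a critical failure adds a $3b lawsuit
# on top. A hypothetical review checklist cuts the success rate from 40%
# to 36% but halves the critical-failure rate from 1% to 0.5%.

def expected_value(p_success, p_critical, cap_downside):
    """Expected payoff; a startup's downside is capped at losing the budget."""
    payoff, budget, disaster = 50e6, 10e6, 3e9
    p_failure = 1 - p_success - p_critical
    loss_on_critical = budget if cap_downside else budget + disaster
    return p_success * payoff - p_failure * budget - p_critical * loss_on_critical

for cap_downside, who in [(True, "startup"), (False, "big co")]:
    fast = expected_value(0.40, 0.01, cap_downside)
    careful = expected_value(0.36, 0.005, cap_downside)
    print(f"{who}: no checklist ${fast / 1e6:+.1f}m, checklist ${careful / 1e6:+.1f}m")
```

With these made-up numbers, the startup maximizes expected value by skipping the checklist, while the big company, which bears the full $3b downside, comes out well ahead with it.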
> lazy domain owners
Interesting. As a consultant for most of the last 25 years, my experience is that domain owners are typically invested and have strong opinions on the things that impact their jobs.
Executive leadership, on the other hand, doesn't want to actually know the issues and eyes glaze over as they look at their watches because they have a tee time.
There's a culture of "I won't approve this unless it does something for me" at Google. So now changing the text on a button comes with 2 minor refactors, 10 obvious-but-ignored bugfixes, and 5 experiments that it is actually better.
While this sounds pretty frustrating, there is at least a small upside: at least you get to the obvious-but-ignored bugfixes.
Most smaller places don’t have the bandwidth and many larger ones don’t have the desire.
I’m not sure if that makes up for bugs potentially introduced in the refactors, though.
Well, when the owner asks for a whole test suite that didn't exist in order to get a fix in, what most likely happens is that you just wasted your time on a draft CL that will get lost.
Do you mean the relevant code area(s) didn't have (sufficient) tests? You're being asked to backfill those missing tests in addition to your fix?
They aren't asking for you to write tests because 'it benefits them', they are asking you to write tests because as a professional engineer, you should write tests, and not just yolo it.
Look, sometimes you may have good reasons for why a test is impractical. You are allowed to push back, or look for a different reviewer. There's a hundred thousand people in the firm, you should be able to find one or two that will let you submit literally anything that compiles.
But most of the time, the reviewer is giving you advice that you should take.
If you are turning a button to a slightly different shade of blue and it's not a button you own, the owner of the button should not be asking you to write tests for the button.
Well, good management/tech leadership is about making sure that the risks coming from individual failure points (10 people in your example) are recognized and mitigated, and that the individuals involved can flag risks and conflicts early enough so that the overall project success probability does not go down as you describe...
The old "If you want to go fast, go alone. If you want to go far, go together."
Also why the optimal business strategy seems to be to go as far as you can alone and then bring on other people when you're running out of steam.
Coordination Headwind: https://komoroske.com/slime-mold/
The assumptions in that math are wrong anyway. Once you depend on 10 people, the chance that they each achieve "95% successful execution" is 0.
This is only partially down to the impossibility of having every staff member on a project be A++ players.
There is coordination RISK not just coordination overhead. Think planning a 2 week trip with your spouse with multiple planes/trains/hotels, museum/exhibit ticket bookings, meal reservations, etc. Inevitably something gets misunderstood/miscommunicated between the two of you and therefore mis-implemented.
Now add more communication nodes to the graph and watch the error rate explode.
That's what the math is reflecting. The project succeeds only if all 10 people do their jobs well. Each person has a 95% chance of succeeding, and 0.95^10 ≈ 60%, so the chance that all 10 people do their jobs successfully is about 60%.
Those jobs also include things like management and product design, and so the coordination risk is reflected in the 5% chance that the manager drops the ball on communication. (As a manager, I suspect that chance is significantly more than 5% and that's why overall success rates are even lower.)
That's what I mean: "only 5%" encapsulating all failure modes (comms, implementation, coordination, etc.) is very low.
And that underestimation compounds, making the headline 60% figure much more optimistic than it should be.
A 7.5% individual failure rate takes the top-level success odds below 50%, to 46%; a not-unrealistic 10% takes it down to 35%.
Etc.
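The arithmetic in this subthread is just independent probabilities multiplied together; a one-liner reproduces all three figures:

```python
# The "collaboration headwind" arithmetic: assuming independent success
# probabilities, per-person reliability compounds multiplicatively.

def project_success(p_individual, n_people):
    return p_individual ** n_people

print(f"{project_success(0.95, 10):.0%}")   # ~60%
print(f"{project_success(0.925, 10):.0%}")  # ~46%
print(f"{project_success(0.90, 10):.0%}")   # ~35%
```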
And when you’re at a smaller company, 90% of your time is spent fighting societal complexity, whose limit also approaches infinity, just at a steeper angle.
No amount of "no true Scotsman" gatekeeping changes the fact that reality is surprisingly complex. Sometimes you have the resources to organize and fight it, sometimes you use those resources more wisely than another group and can share the lessons, and sometimes you have no idea whether your lesson is even useful. Let’s judge the story on its merits and learn what we can from it.
Look, I've never had to design, build, or maintain systems at the scale of a FAANG, but that doesn't mean I haven't been involved in pretty complicated systems (e.g., 5000 different pricing and subsidy rules for 5000 different corporate clients with individually negotiated hardware subsidies (changing all the time) and service plans, commission structures, plus logistics, which involves not only shipping but shipping to specific departments for configuration before the device goes to the employee, etc.).
Anecdotally, 95% of the time the issues were people problems, not technical ones.
You are right, but that misses the flavor of the problem. I was a consultant in infosec to F500s for many years. Often, solving a problem involves simply knowing the right person who has already thought about it or toiled on that problem or a similar one. But when there are 100,000 engineers, that becomes an order of magnitude (or two!) more difficult, and that puts forth unique challenges. You can still call them “people problems”, and they often may be. However, if you try to solve them the same way you might at a smaller engineering org, you will get nowhere and be poorer for the time you spent trying it. Ask me how I know, lol.

The technical problems are also like that. Almost everything has an analog to something you are probably familiar with, but it is scaled out, has a lot of unfamiliar edges, and is often just different enough that you have to adjust your reasoning model. Things you can just do at even a typical F500 you can’t just do at big-tech scale.

Anyway, you are directionally correct, and many of these wounds are self-inflicted. But running a company like Google or Facebook is ridiculously hard and there are no easy answers; we just do our best.
Fair, but just in case, the system I used as an anecdote is operated for a company that has 45,000+ direct employees and $25 billion annual revenue.
I have a similar perspective. I think after a few years, it's the people things that have always been the hardest part of the job. That's probably why in the interviews, we always say things like: communication is key, culture fit, etc.
On the other hand, the good part of the job is solving complex technical problem with a team.
Equally important is the amount of time they save because of available abstractions to use like infra, tooling etc
I think this is addressed by the complex vs. complicated intro. Most problems with uncontrolled/uncontrollable variables will be approached with an incremental solution: you'll restrict those variables, voluntarily or involuntarily, and either let issues be solved organically/manually, or automation will plainly and simply be abandoned.
That qualifies as complicated. Delving into complicated problems is mostly driven by business opportunity, always has limited scaling, and tends to be discarded by big players.
I don't think it is, because the intro gets it wrong. If a problem's time or space complexity increases from O(n^2) to O(n^3) there's nothing necessarily novel about that, it's just... more.
Complicated on the other hand, involves the addition of one or more complicating factors beyond just "the problem is big". It's a qualitative thing, like maybe nobody has built adequate tools for the problem domain, or maybe you don't even know if the solution is possible until you've already invested quite a lot towards that solution. Or maybe you have to simultaneously put on this song and dance regarding story points and show continual progress even though you have not yet found a continuous path from where you are to your goal.
Climate change is both, doing your taxes is (typically) merely complex. As for complicated-but-not-complex, that's like realizing that you don't have your wallet after you've already ordered your food: qualitatively messy, quantitatively simple.
To put it differently, complicated is about the number of different domains you have to consider; complex is about, within a given domain, how difficult the considerations in that domain are.
Perhaps the author's usage is common enough in certain audiences, but it's not consistent with how we discuss computational complexity. Which is a shame since they are talking about solving problems with computers.
I don't think this is adequately addressed by the "complicated vs. complex" framing—especially not when the distinction is made using reductive examples like taxes (structured, bureaucratic, highly formalized) versus climate change (broad, urgent, signaling-heavy).
That doesn’t feel right.
Let me bring a non-trivial, concrete example—something mundane: “ePOD,” which refers to Electronic Proof of Delivery.
ePOD, in terms of technical implementation, can be complex to design for all logistics companies out there like Flexport, Amazon, DHL, UPS, and so on.
The implementation itself—e.g., the box with a signature open-drawing field and a "confirm" button—can be as complex as they want from a pure technical perspective.
Now comes, for me at least, the complex part: in some logistics companies, the ePOD adoption rate is circa 46%. In other words, for 54% of all deliveries, you have no real-time way (nothing before 36–48 hours) to know and track whether the person received the goods or not. Unsurprisingly, most of those are still done on paper. And we have:
- Truck drivers are often independent contractors.
- Rural or low-tech regions lack infrastructure.
- Incentive structures don’t align.
- Digitization workflows involve physical paper handoffs, WhatsApp messages, or third-party scans.
So the real complexity isn't only the "technical implementation of ePOD" but: "given the ePOD, how do we maximize its adoption/coverage amid a lot of uncertainty, fragmentation, and human unpredictability on the ground?"
That’s not just complicated, it’s complex, because we have:
- Socio-technical constraints,
- Behavioral incentives,
- Operational logistics,
- Fragmented accountability,
- And incomplete or delayed data.
We went off the highly controlled scenario (an arbitrarily bounded technical implementation) that could be considered complicated (if we want to be reductionist, as the OP has been), and now we’re navigating uncertainty and any number of issues that can go wrong.
I've not seen "accidental" complexity used to mean "domain" (or "environmental" or "inherent") complexity before. It usually means "the complexity you created for yourself and isn't fundamental to the problem you're solving"
Also, anything you do with enterprise (cloud) customers. People like to talk about scale a lot and data people tend to think about individual (distributed) systems that can go webscale. A single system with many users is still a single system. In enterprise you have two additional types of scale:
1) scale of application variety (10k different apps with different needs and history)
2) scale of human capability (ingenuity), this scale starts from sub-zero and can go pretty high (but not guaranteed)
I'm a HW engineer and don't really understand "complexity" as this article describes it. I didn't read it in depth, but it doesn't really give any good examples with specifics. Can someone give a detailed example of what the author is really talking about?
Rich Hickey is famous for talking about easy vs. simple/complex and essential vs. incidental complexity.
“Simple Made Easy”: https://youtu.be/SxdOUGdseq4?si=H-1tyfL881NawCPA
> My immediate reaction in my head was: "This is impossible". But then, a teammate said: "But we're Google, we should be able to manage it!".
Google, where the impossible stuff is reduced to merely hard, and the easy stuff is raised to hard.
This is probably the most accurate statement possible.
“I just want to store 5TiB somewhere”
“Ha! Did you book multiple Bigtable cells?”
https://youtu.be/3t6L-FlfeaI?si=C5PJcrvLepABZsVF
What are peer-bonuses?
The idea is if someone helps you in a really big way that you’re able to reward that. So you can ask the company to give the person either credits for an internal store, or a direct addition to their salary for one month.
Obviously, there are limits to how many pay bonuses you can give out and if it’s direct money or store credits.
Directly asking for a peer bonus is not very “googly” (and yes, this is a term they use, in case you needed evidence of Google being a bit cultish).
There are companies that do this “as a service”: https://bonusly.com/
My last workplace had a similar institution, only the reward was candy bar or similar that you could go grab from a bowl in the kitchen (working on an honor code basis), in addition to getting some praise on Slack for general warm fuzzies. It was more of a symbolic gesture for recognizing small everyday things, of course, but it was nice IMO.
Capitalism in action!
I don’t get the connection to capitalism here. Care to elaborate?
Probably referring to the fact that they only rewarded them with a candy bar for being a good employee. Which ignores the fact that they're already probably getting paid a decent salary to do their job, and being a good employee is already part of the job description to receive said salary. Anything extra is nice.
Yeah. The chocolate was of course a triviality, more important was the idea of encouraging people to give public thanks and the associated (extremely immaterial) karma points when thanks are due. In this culture (Finnish) we're perhaps not very good at giving praise, and even worse at receiving it, so it helps to have an established ritual for doing so. Also, I think at least one of the original goals was to mitigate the silo effect and encourage people to help their coworkers in other projects and such.
> The idea is if someone helps you in a really big way that you’re able to reward that
It never ceases to amaze me how (early) big tech embraced and even promoted things that would have been considered "career limiting" in traditional big corporations.
By systematising/gamifying this stuff you actually help distract people from participating in the realpolitik going on within the executive team. If you stop other non-exec level realising the real way power is exercised within the company with these distractions it removes a potentially very large pool of competitors for power within the org.
> would have been considered "career limiting" in traditional big corporations.
How so?
Don't know about your flavor of 'traditional big corporations', but my banking megacorp has had an internal reward system across various 'virtues' for a decade-plus at least. It's not a direct reward -> money link (that's rather for hiring success); it just helps you build a sort of karma, and when bonuses, raises, and promotions are considered, this is taken into account.
Since that process is invisible to those being measured, you never know the details (and shouldn't, as long as management is sane; if it isn't, that's the least of your concerns), but it's not ignored, and in this way it helps keep people motivated to generally do good work.
Big bank. Management theory at the time was to create competition between the silos for resources, time, budget, headcount, good desk locations in the bi-annual desk shuffle, bonuses, and even time of day from management. Even sales and trading, the most symbiotic of functions, competed.
Credits to the store? I have never heard of this or seen that.
I wasn't aware of bonuses-as-a-service. Thanks for sharing.
I was in kindergarten watching my fellow classmates get gold star stickers on their work. They were excited when it happened to them. I saw it as being given nothing of real value; a person could just go to the store and buy a sheet of them for $1 or $2.
It is a social engineering technique to exploit more work without increasing wages. Just like "Employee of the Month" or a "Pizza Party."
The company I work for does this with gift cards as rewards. I was reprimanded because I sent an email to HR saying this "gift" is as useful as a wet rag in the rain. I don't eat at restaurants that are franchises or have a ticker on Wall Street, I prefer local brick-and-mortar over Walmart, and I will never financially support Amazon.
If you want to truly honor my accomplishments, give me a raise or more PTO. Anything else is futile. That gift card to Walmart has zero value toward a quality purchase like a radar or LiDAR development kit to learn more.
At a previous company I worked at, peer bonuses literally resulted in a small bonus at the end of the pay period. No gift card, just an email notification and money credited to your account. Most motivating form of peer appreciation I've seen.
Basically a way to "tip" people for going out of their way to help you, except that the "tip" comes out of the company's pocket, not yours.
To prevent obvious abuse, you need to provide a rationale, the receiver's manager must approve and there's a limit to how many you can dish out per quarter.
Something designed to remove all intrinsic motivation from employees
Bonuses make a lot of sense in the financial sector, because the whole endeavor is about making money. Intrinsic motivation and making more money align. Historically it got introduced in order to mitigate cheating customers for personal gain. Also it helps that individual contributions are trivially quantifiable to a very large degree.
Obviously there are other professions that share some of these characteristics, like sales. Or if you narrow down a goal or task to "save us money".
> intrinsic motivation
Funny way to spell "unpaid extra work".
Or "How many MDB groups do I need to get approved to join over multiple days/weeks, before I can do the 30 second thing I need to do?"
Do not miss
“The difficult we do immediately. The impossible takes a little longer.” (WW2 US Army Corps of Engineers)
>“the difficult we do immediately. The impossible takes a little longer”
This was posted in my front office when I started my company over 30 years ago.
It was a no-brainer, same thing I was doing for my employer beforehand. Experimentation.
By the author's own terminology, the ratio of complexity to mere complication in something like Google's technology is on a different scale from the absolute chaos you find when you apply the same distinction to natural science.
I learned how to do what I do directly from people who did it in World War II.
And that was when I was over 40 years younger, plus I'm not done yet. Still carrying the baton in the industrial environment where the institutions have a pseudo-military style hierarchy and bureaucracy. Which I'm very comfortable working around ;)
Well, the army is a massive mainstream corp.
There are always some things that corps don't handle very well, but generals don't always care, if they have overwhelming force to apply, lots of different kinds of objectives can be overcome.
Teamwork, planning, military-style discipline & chain-of-command/org-chart, strength in numbers, all elements which are hallmarks of effective armies over the centuries.
The engineers are an elite team among them. Traditionally like the technology arm, engaged to leverage the massive resources even more effectively.
The bigger the objective, the stronger these elements will be brought to bear.
Even in an unopposed maneuver, while easily recognized obstacles get steam-rolled more and more effectively as they up the ante, bigger and bigger unscoped problems accumulate, exactly the kind that cannot be solved with teamwork and planning (since those are often completely forbidden). Those demand extreme individual ability far beyond that, and it must emanate from the top decision-maker or have "equivalent" access to the top individual decision-maker. In other words, such people might as well not even be "in" the org chart, since they are just a few individuals attached directly to the top square, and nobody is working for further promotions or recognition beyond that point.
When military discipline in practice is simply not enough discipline, and not exactly the kind that's needed by a long shot.
That's why even in the military there are a few Navy Seals here and there, because sometimes there are serious problems that are the kind of impossible that a whole army cannot solve ;)
“and the easy... well, that’s not a good promo artifact, so never”
> My immediate reaction in my head was: "This is impossible". But then, a teammate said: "But we're Google, we should be able to manage it!".
"We can do it!" confidence can be mostly great. (Though you might have to allow for the possibility of failure.)
What I don't have a perfect rule for is how to avoid that twisting into arrogance and exceptionalism.
Like, "My theory is correct, so I can falsify this experiment."
Or "I have so much career potential, it's to everyone's advantage for me to cheat to advance."
Or "Of course we'll do the right thing with grabbing this unchecked power, since we're morally superior."
Or "We're better than those other people, and they should be exterminated."
Maybe part of the solution is to respect the power of will, effort, perseverance, processes, etc., but to be concerned when people don't also respect the power and truth of humility, and start thinking of individual/group selves as innately superior?
There is a certain amount of irony when the cookie policy agreement is buggy on a story about complicated & complex systems.
Clicking on "Only Necessary" causes the cookie policy agreement to reappear.
I dont see a cookie banner. Thankfully.
It didn't appear on DuckDuckGo either, Thanks.
Not for me, on Chrome now
I think there are two myths applicable here. Probably more.
One myth is that complex systems are inherently bad. Armed forces are incredibly complex. That's why it can take 10 or more rear-echelon staff to support one fighting soldier. Supply-chain logistics and materiel are complex. Middle Ages wars stopped when gunpowder supplies ran out.
Another myth is that simple systems are always better and remain simple. They can be, yes. After all, DNA exists. But some beautiful things demand complexity built up from simple things. We still don't entirely understand how DNA and environment combine. Much is hidden in this simple system.
I do believe one programming language might be a rational simplification, if you exclude all the DSLs which people implement to tune it.
> Middle ages wars stopped when gunpowder supplies ran out.
The arquebus is the first mass gunpowder weapon, and doesn't see large scale use until around the 1480s at the very, very tail end of the Middle Ages (the exact end date people use varies based on topic and region, but 1500 is a good, round date for the end).
In Medieval armies, your limiting factor is generally that food is being provided by ransacking the local area for food and that a decent portion of your army is made up of farmers who need to be back home in the harvest season. A highly competent army might be able to procure food without acting as a plague on all the local farmlands, but most Medieval states lacked sufficient state capacity to manage that (in Europe, essentially only the Byzantines could do that).
Following the definition from the article, armed forces seems like a complicated system, not a complex one. There is a structured, repeatable solution for armed forces. It does not exhibit the hallmark characteristics of complex systems listed in the article like emergent behaviors.
Agreed. The problem is not complexity. Every system must process a certain amount of information, and the system's complexity must be able to match that amount. The fundamental problem is designing systems that can manage complexity, especially runaway complexity.
> Middle ages wars stopped when gunpowder supplies ran out
Ukraine would have been conquered by Russia rather quickly if the Russians weren't so hilariously incompetent at these complex tasks, war logistics being the king of them. Remember that 64 km queue of heavy machinery [1] just sitting still? That was 2022, and we are talking about fuel and food, the basics of logistics support.
[1] https://en.wikipedia.org/wiki/Russian_Kyiv_convoy
The cookie banner reappears indefinitely on this website when I click 'only necessary' lol.
Thankfully I dont see a cookie banner at all. Did you try moving continents?
Sorry about that, it's my newsletter provider (Substack), which is very buggy sometimes.
Probably because it is an overly complex system.
By choice or incompetence, because serving text over HTTP is very well abstracted nowadays.
if only there were some simple solution to host a static website without cookies and other garbage
https://cloud.google.com/storage/docs/hosting-static-website + pick your favorite OSS CMS
Mostly overlapping with the definition of a 'complex system' at:
https://en.wikipedia.org/wiki/Complex_system
although as I understood it, the key to a system being complex (as opposed to complicated) is having a large number of types of interactions. A system with a large number of parts is not enough; those parts have to interact in a number of different ways for the system to exhibit emergent effects.
Something like that. I remember reading a lot of books about this kind of thing a while ago :)
I think you are using "hysteresis" when you actually mean the more general "path dependence".
Except computers attempt to model mathematics in an ideal world.
Unless your problem comes from side effects on a computer that can't be modeled mathematically, there is nothing technically stopping you from modeling the problem as a mathematical problem and then solving that problem via mathematics.
Like, the output of an LLM can't be modeled; we literally do not understand it. Are the problems faced by SREs exactly the same? You give a system an input of B and you can't predict the output A mathematically? It doesn't even have to be a single equation; a simulation can do it.
I think the vast majority of SRE problems are in the “side effects” category. But higher level than the hardware-level side effects of the computer that you might be imagining.
The core problem is building a high enough fidelity model to simulate enough of the real world to make the simulation actually useful. As soon as you have some system feedback loops, the complexity of building a useful model skyrockets.
Even in “pure” functions, the supporting infrastructure can be hard to simulate and critical in affecting the outputs.
Even doing something simple like adding two numbers involves an unimaginable amount of hidden complexity under the hood. It is almost impossible for these things not to have second-order effects and emergent behaviour at sufficient scale.
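The feedback-loop point is easy to see in a toy model (all numbers hypothetical, not any real system): clients that re-send failed requests feed extra load back into an overloaded service, so steady-state load is worse than the offered demand.

```python
# Toy sketch of a feedback loop (hypothetical numbers): a service has
# capacity for 100 req/s, and clients re-send half of all failed
# requests, so failures feed back into the offered load.

def steady_load(demand, capacity=100.0, retry_fraction=0.5, rounds=200):
    """Iterate the load/retry loop until it (hopefully) settles."""
    load = demand
    for _ in range(rounds):
        failed = max(0.0, load - capacity)          # requests dropped this round
        load = demand + retry_fraction * failed     # failures come back as retries
    return load

print(steady_load(90))    # under capacity: no amplification, stays at 90
print(steady_load(110))   # 10% over capacity settles near 120, not 110
```

Below capacity nothing interesting happens; 10% over capacity settles at roughly 20% over, and with a retry fraction near 1 the load diverges entirely. Models of real systems have many such loops interacting, which is why they get hard to build fast.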
Can you give me an example of some problem that emerged that was absolutely unpredictable.
This is all exacerbated by a ton of the ML stack being in Python, for some godforsaken reason.
Let's add a post scriptum:
Whatever you're working on, your project is not likely to be at Google's scale and very unlikely to be a "complex system".
Let's add a post post scriptum :)
Just because your project might not be at Google's scale doesn't mean it is therefore also not complex [1]
Example: I'd say plenty of games fit the author's definition of "complex systems". Even the well-engineered ones (and even some which could fit on a floppy disc)
[1]: https://en.m.wikipedia.org/wiki/Affirming_the_consequent
Speaking of games, why hasn't Google made a game? They could create a gaming division and, well, make one. Amazon did. I wonder why they haven't.
Google has a really hard time grokking the games industry, to the point they can hire people from it and just almost totally ignore them. Their ideas on how Android game development should be done were utterly hilarious, and it's only because of a couple of their dev relations people going to ludicrous lengths that it is actually viable at all.
Fundamentally, and ironically, Google likes to offload complexity on to everyone else in their ecosystems, and they got so used to people being willing to jump through hoops to do this for search ads/SEO they are very confused when faced with a more competitive environment.
One reason Google can't make games is they can't conceive of a simple enough platform on which to design and develop one. It would be a far too adventurous constantly moving target of wildly different specifications, and they would insist you support all possible permutations of everything from the start. There are reasons people like targeting games consoles, as it lets you focus on the important bits first.
As someone who has firsthand experience:
A. The same reason Amazon had/has such a hard time.
B. Google lacking the same persistence of Amazon (Consider all the products that are killed)
C. Google's hiring process. (They organizationally do not know how to hire specialists.)
>B. Google lacking the same persistence of Amazon (Consider all the products that are killed)
Yeah, like Stadia, Google's game-streaming thing. They even had a first-party game development division for it. So, exactly what OP was wondering about.
Google made Ingress and Pokémon Go (Niantic was part of Google before it was spun off).
https://en.wikipedia.org/wiki/Niantic,_Inc.
/me pours one out for Stadia
IMO an even more interesting observation is that even Google itself doesn't necessarily work at large scale; e.g. many regionalised services in Google Cloud don't have _that_ many requests in each region, allowing for a much simpler architecture compared to behemoths like Gmail or Maps.
Don't underestimate my colleagues' abilities to turn the simple into the complex!
Managing complexity pays off sooner than one would think.
Even a project that's like 15k lines of code would benefit from a conscious effort to fight against complexity.
100%
IMO what we term "complex" tends to be whatever the current setup/system struggles to deal with or manage. Relatively speaking, Google has much, much higher complexity, but that doesn't matter as much, because even in simpler cases we are dealing with a huge amount of variety and possible states, and the principles of managing that remain the same regardless of scale.
At small scale one can build a simple system, but I see many trying to copy FAANG architecture anyway. IMHO it's a fallacy: people think that if they copy the architecture used by Google, their company will be successful like Google. I think it's the other way around: Google has to build complex systems because it has many users.
Yes, it's called "cargo cult" and it applies to a lot of architecture and processes decisions in IT :)
It’s an infectious disease among developers. Some people will spend weeks making a simple landing page, and it will require at least 3 different cloud services.
Interesting. Thanks to the writer.
However, all this amazing stuff in the service of... posting ads?