#: subject: (Hello World Marketing (or, How I Find Good, Boring Software))
#: via: (https://theartofmachinery.com/2019/03/19/hello_world_marketing.html)
#: author: (Simon Arneaud https://theartofmachinery.com)

Hello World Marketing (or, How I Find Good, Boring Software)

Back in 2001 Joel Spolsky wrote his classic essay “Good Software Takes Ten Years. Get Used To it”. Nothing much has changed since then: software is still taking around a decade of development to get good, and the industry is still getting used to that fact. Unfortunately, the industry has investors who want to see hockey stick growth rates on software that’s a year old or less. The result is an antipattern I like to call “Hello World Marketing”. Once you start to notice it, you see it everywhere, and it’s a huge red flag when choosing software tools.

Of course, by “Hello World”, I’m referring to the programmer’s traditional first program: the one that just displays the message “Hello World”. The aim isn’t to make a useful program; it’s to make a minimal starting point.

Hello World Marketing is about doing the same thing, but pretending that it’s useful. You’re supposed to be distracted into admiring how neatly a tool solves trivial problems, and forget about features you’ll need in real applications. HWM emphasises what can be done in the first five minutes, and downplays what you might need after several months. HWMed software is optimised for looking good in demos, and sounding exciting in blog posts and presentations.

For a good example, see Nemil Dalal’s great series of articles about the early marketing for MongoDB. Notice the heavy use of hackathons, and that a lot of the marketing was about how “SQL looks like COBOL”. Now, I can criticise SQL, too, but if SELECT and WHERE are serious problems for an application, there are already hundreds of solutions like SQLAlchemy and LINQ, and those solutions don’t compromise on more advanced features of traditional databases. On the other hand, if you were wondering about those advanced features, you could read vomit-worthy, hand-wavey pieces like “Living in the post-transactional database future”.

How I Find Good, Boring Software

Obviously, one way to avoid HWM is to stick to software that’s much more than ten years old, and has a good reputation. But sometimes that’s not possible because the tools for a problem only came out during the last decade. Also, sometimes newer tools really do bring new benefits.

However, it’s much harder to rely on reputation for newer software because “good reputation” often just means “popular”, which often just means “current fad”. Thankfully, there’s a simple and effective trick to avoid being dazzled by hype: just look elsewhere. Instead of looking at the marketing for the core features, look at the things that are usually forgotten. Here are the kinds of things I look at:

Backups and Disaster Recovery

Backup support is both super important and regularly an afterthought.

The minimum viable product is full data dump/import functionality, but longer term it’s nice to have things like incremental backups. Some vendors will try to tell you to just copy the data files from disk, but this isn’t guaranteed to give you a consistent snapshot if the software is running live.

There’s no point backing up data if you can’t restore it, and restoration is the difficult part. Yet many people never test the restoration (until they actually need it). About five years ago I was working with a team that had started using a new, cutting-edge, big-data database. The database looked pretty good, but I suggested we do an end-to-end test of the backup support. We loaded a cluster with one of the multi-terabyte datasets we had, did a backup, wiped the data in the cluster and then tried to restore it. Turns out we were the first people to actually try to restore a dataset of that size: the backup “worked”, but the restoration caused the cluster to crash and burn. We filed a bug report with the original database developers and they fixed it.

Backup processes that work on small test datasets but fail on large production datasets are a recurring theme. I always recommend testing on production-sized datasets, and testing again as production data grows.
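
As a rough illustration of that kind of end-to-end test, here’s a minimal Python sketch using SQLite from the standard library. The `events` table and the helper names are made up for the example, and a plain SQL text dump stands in for whatever backup tool your database actually provides. The point is the shape of the test: back up, restore into something empty, then compare.

```python
import hashlib
import sqlite3
from typing import Tuple


def make_sample_db(rows: int) -> sqlite3.Connection:
    """Build an in-memory database with a hypothetical `events` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany(
        "INSERT INTO events (id, payload) VALUES (?, ?)",
        ((i, f"payload-{i}") for i in range(rows)),
    )
    conn.commit()
    return conn


def backup_and_restore(source: sqlite3.Connection) -> sqlite3.Connection:
    """Dump the source to SQL text, then load it into a brand-new database."""
    dump_sql = "\n".join(source.iterdump())  # the "backup"
    restored = sqlite3.connect(":memory:")   # stands in for the wiped cluster
    restored.executescript(dump_sql)         # the "restore"
    return restored


def table_fingerprint(conn: sqlite3.Connection, table: str) -> Tuple[int, str]:
    """Return (row count, SHA-256 over all rows, ordered by the first column)."""
    digest = hashlib.sha256()
    count = 0
    # The table name is interpolated into the query, so only pass trusted names.
    for row in conn.execute(f"SELECT * FROM {table} ORDER BY 1"):
        digest.update(repr(row).encode("utf-8"))
        count += 1
    return count, digest.hexdigest()
```

A test would then assert that `table_fingerprint(original, "events")` equals `table_fingerprint(restored, "events")`. A toy like this only checks correctness; it says nothing about whether the restore survives a multi-terabyte dataset, which is exactly why the real test needs production-sized data.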

For batch jobs, a related concept is restartability. If you’re copying large amounts of data from one place to another, and the job gets interrupted in the middle, what happens? Can you keep going from the middle? Alternatively, can you safely retry by starting from the beginning?
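
Here’s one way both properties can look in practice, as a minimal Python sketch. Everything in it is hypothetical: the source is a plain list, the sink is a dict keyed by item index, and the checkpoint is a small JSON file. The `fail_after` parameter exists only to simulate an interruption.

```python
import json
import os


def copy_with_checkpoint(source, sink, checkpoint_path, batch_size=100, fail_after=None):
    """Copy items from `source` (a list) into `sink` (a dict keyed by index).

    Progress is saved after every batch, so a rerun resumes from the last
    completed batch. Writes are keyed by index, so replaying a batch that was
    written but not yet checkpointed overwrites instead of duplicating.
    Returns the index the run started from.
    """
    start = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            start = json.load(f)["next_index"]

    batches_done = 0
    for begin in range(start, len(source), batch_size):
        if fail_after is not None and batches_done >= fail_after:
            raise RuntimeError("simulated interruption")
        end = min(begin + batch_size, len(source))
        for i in range(begin, end):
            sink[i] = source[i]  # idempotent write: same key, same value
        # Save the checkpoint atomically: write a temp file, then rename it.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"next_index": end}, f)
        os.replace(tmp_path, checkpoint_path)
        batches_done += 1
    return start
```

The checkpoint answers “can you keep going from the middle?”, and the keyed writes answer “can you safely retry?”. A job that appends blindly to its destination has neither property.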

Configuration

A lot of HWMed software can only be configured using a GUI or web UI because that’s what’s obvious and looks good in demos and docs. For one thing, this usually means there’s no good way to back up or restore the configuration. So if a team of people use a shared instance over a year or so, forget about trying to restore it if (or when) it breaks. It’s also much more work to keep multiple deployments consistent (e.g., for dev, testing and prod environments) using separate GUIs. In practice, it just doesn’t happen.

I prefer a well-commented config file for software I deploy, if nothing else because it can be checked into source control, and I know I can reproduce the deployment using nothing but what’s checked into source control. If something is configured using a UI, I look for a config export/import function. Even then, that feature is often an afterthought, and often incomplete, so it’s worth testing if it’s possible to deploy the software without ever needing to manually tweak something in the UI.
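
One way to test that is to compare what’s checked in against what the running instance exports. The following Python sketch assumes the tool can export its settings as nested key/value data (for example, JSON); the function names and the example settings are invented for illustration.

```python
MISSING = "<missing>"  # marker for a setting present on only one side


def flatten(config, prefix=""):
    """Flatten nested dicts into {"a.b.c": value} form for easy comparison."""
    flat = {}
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def config_drift(expected, actual, ignore=()):
    """Return {path: (expected, actual)} for every setting that differs."""
    want, got = flatten(expected), flatten(actual)
    drift = {}
    for path in sorted(set(want) | set(got)):
        if path in ignore:
            continue
        left = want.get(path, MISSING)
        right = got.get(path, MISSING)
        if left != right:
            drift[path] = (left, right)
    return drift
```

If `config_drift(checked_in, exported)` comes back empty after a fresh deployment, the export/import path is probably complete enough to rely on. Anything that shows up as drift is a setting someone had to click into place by hand.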

There seems to be a recent trend for software to be configured using a REST API instead. Honestly, this is the worst of both config files and GUI-based config, and most of the time people end up using hacky ways to put the config into a file instead.

Upgrades

Life would be much easier if everything were static; software upgrade support makes everything more complicated. It’s also not usually shown in demos, so the first upgrade often ends the honeymoon with shiny, new software.

For HA distributed systems, you’ll need support for graceful shutdown and a certain amount of forward and backwards compatibility (because you’ll have multiple versions running during upgrades). It’s a common mistake to forget about downgrade support.
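
To make the compatibility requirement concrete, here’s a small Python sketch of two versions of a message type that can read each other’s messages. The `JobV1`/`JobV2` types and the `priority` field are hypothetical, and real systems usually get this behaviour from their serialisation format rather than hand-rolled code.

```python
from dataclasses import dataclass, fields


@dataclass
class JobV1:
    job_id: str


@dataclass
class JobV2:
    job_id: str
    priority: int = 0  # new in v2; the default keeps v1 messages readable


def decode(cls, message: dict):
    """Build `cls` from a dict, silently dropping fields it doesn't know about."""
    known = {f.name for f in fields(cls)}
    return cls(**{key: value for key, value in message.items() if key in known})
```

During a rolling upgrade, an old node calling `decode(JobV1, ...)` on a v2 message ignores `priority` (forward compatibility), and a new node calling `decode(JobV2, ...)` on a v1 message fills in the default (backwards compatibility). The same two properties are what make a downgrade possible.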

Distributed systems are simpler when components have independent replicas that don’t communicate with each other. Anything with clustering (or, worse, consensus algorithms) is often extra tricky to upgrade, and worth testing.

Things that support horizontal scaling don’t necessarily support rescaling without downtime. This is especially true whenever sharding is involved because live resharding isn’t trivial.
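
A quick way to see why is to count how many keys change shard when a shard is added. This Python sketch assumes the simplest possible scheme, hash-modulo sharding; that’s an assumption for illustration, not a claim about how any particular database assigns shards.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Assign a key to a shard using a stable hash modulo the shard count."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def fraction_moved(keys, old_count: int, new_count: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for key in keys if shard_for(key, old_count) != shard_for(key, new_count)
    )
    return moved / len(keys)
```

Going from 4 shards to 5 under this scheme, a key stays put only when its hash has the same remainder modulo 4 and modulo 5, which is about one key in five. So roughly 80% of the data has to move while the system is still serving traffic. Schemes like consistent hashing reduce the amount moved, but the live migration still has to be designed and tested.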

Here’s a story from a certain popular container app platform. Demos showed how easy it was to launch an app on the platform, and then showed how easy it was to scale it to multiple replicas. What they didn’t show was the upgrade process: When you pushed a new version of your app, the first thing the platform did was shut down all running instances of it. Then it would upload the code to a build server and start building it, meaning downtime for however long the build took, plus the time needed to roll out the new version (if it worked). This problem has been fixed in newer releases of the platform.

Security

Even if software has no built-in access control, all-or-nothing access control is easy to implement (e.g., using a reverse proxy with HTTP basic auth). The harder problem is fine-grained access control. Sometimes you don’t care, but in some environments it makes a big difference to what features you can even use.

Some immature software has a quick-and-dirty implementation of user-based access control, typically with a GUI for user management. For everything except the core business tool, this isn’t very useful. For human users, every project I’ve worked on has either been with a small team that just shared a single username/password, or with a large team that wanted integration with OpenID Connect, or LDAP, or whatever centralised single-sign-on (SSO) system was used by the organisation. No one wants to manually manage credentials for every tool, every time someone joins or leaves. Similarly, credentials for applications or other non-human users are better generated using an automatable approach, like a config file or API.

Immature implementations of access control are often missing anything like user groups, but managing permissions at the user level is a time waster. Some SSO integrations only integrate users, not groups, which is a “so close yet so far” when it comes to avoiding permissions busywork.
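
The difference is easy to see in a toy model. In this Python sketch, permissions are attached to groups and users only ever get them through group membership; the group names and permission strings are invented for the example.

```python
def effective_permissions(user, user_groups, group_permissions):
    """Resolve a user's permissions purely through group membership."""
    permissions = set()
    for group in user_groups.get(user, ()):
        permissions |= group_permissions.get(group, set())
    return permissions


# Hypothetical data, as it might arrive from an SSO system that syncs groups.
USER_GROUPS = {"alice": ["dev", "oncall"], "bob": ["dev"]}
GROUP_PERMISSIONS = {
    "dev": {"read", "deploy:staging"},
    "oncall": {"deploy:prod"},
}
```

With this model, onboarding someone is a single group membership change in the SSO system, and the tool never needs touching. If the integration only syncs users, someone still has to maintain the equivalent of `USER_GROUPS` by hand inside every tool.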

Others

I talked about ignoring the hype, but there’s one good signal you can get from the marketing: whether the software is branded as “enterprise” software. Enterprise software is normally bought by someone other than the end user, so it’s usually pleasant to buy but horrible to use. The only exceptions I know of are enterprise versions of normal consumer software, and enterprise software that the buyer will also have to use. Be warned: even if a company sells enterprise software alongside consumer software, there’s no guarantee that they’re just different versions of the same product. Often they’ll be developed by separate teams with different priorities.

A lot of the stuff in this post can be checked just by skimming through the documentation. If a tool stores data, but the documentation doesn’t mention backups, there probably isn’t any backup support. Even if there is and it’s just not documented, that’s not exactly a good sign either. So, sure, documentation quality is worth evaluating by itself. On the other hand, sometimes the documentation is better than the product, so I never trust a tool until I’ve actually tried it out.

When I first saw Python, I knew that it was a terrible programming language because of the way it used whitespace indentation. Yeah, that was stupid. Later on I learned that 1) the syntax wasn’t a big deal, especially when I’m already indenting C-like languages in the same way, and 2) a lot of practical problems can be solved just by gluing libraries together with a few dozen lines of Python, and that was really useful. We often have strong opinions about syntax that are just prejudice. Syntax can matter, but it’s less important than how the tool integrates with the rest of the system.

Weighing Pros and Cons

You never need to do deep analysis to detect the most overhyped products. Just check a few of these things and they’ll fail spectacularly.

Even with software that looks solid, I still like to do more tests before entrusting a serious project with it. That’s not because I’m looking for excuses to nitpick and use my favourite tool instead. New tools often really do bring new benefits. But it’s much better to understand the pros and cons of new software, and to use it because the pros outweigh the cons, not because of how slick the Hello World demo is.


via: https://theartofmachinery.com/2019/03/19/hello_world_marketing.html

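Author: Simon Arneaud. Topic selection: lujun9972. Translator: (translator ID). Proofreader: (proofreader ID).

This article was originally compiled by LCTT and is proudly presented by Linux China.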