Rube Goldberg's CSV


At some invisible horizon-line future moment, marketing systems and practices of today will look as quaint, convoluted, and fragile as the mechanical systems immortalized by Rube Goldberg do to us. Goldberg’s whimsical machines — and more than a few real-world machines — amuse because they expend so much design and effort for such trivial ends. In our moment, though, marketers are so overwhelmed by systems and data that it’s difficult to remember what the marketing job-to-be-done is.

 “The aim of marketing is to know and understand the customer so well the product or service fits him and sells itself.” —Peter Drucker

If the digital mechanisms of a marketing operation — let’s say, an email department — were equally visible, would they look much different?

This comparison popped into mind across a number of discussions involving marketing, email, and data, and it’s the fragile and convoluted state of data-handling that I’ll parse in this post. In a nutshell, I question whether the Rube Goldberg state of data in most organizations is truly helping them succeed at the jobs of marketing.

Hype around data is rampant. “It’s data that makes the largest technology companies so valuable.” “Data i’s the new oil!” While we’re past “big data” — Gartner dropped that phrase into the ‘trough of disillusionment’ over a decade ago — there’s still a pervasive notion that this nebulous stuff we call data is quite valuable.

OK — maybe it is, or can be. But if you do hands-on work in the email space, can you tell me with a straight face that you haven’t touched a CSV file in over a year?

The issue with data in the email space is that it sucks. It’s a Rube Goldberg contraption. It starts with the early bird arriving to get the worm — also known as “please type your email address in this web form” — and causes all sorts of string-pulling and pistol-shooting, in hopes that the sponge will gain weight, or the customer record gain meaning. 

The nudge about CSVs is intentional; while it sounds like nit-picking, the fact that such a flat, dumb, and ancient standard is the go-to data interchange in the email (and marketing) space tells us, Houston, we have a problem.

CSV can rightly be called ‘dumb’ because there’s no data about the data in the data. If ‘Matthew’ is in the first-name column (position), it’s a first name. The poor overworked comma character holds the wall between columns; one missed comma and ‘Matthew’ becomes something else — last, company, whatever. It’s brittle; one tap (or tab) in the wrong spot, and the whole thing falls apart. Failed imports of N-thousand records is kind of normal.

The idea of putting data about the data in the data — metadata — isn’t new. Neither is the idea of structures richer than “really, really, really long columns.”  XML (1996) predates CSV, but XML-based import or export in marketing systems is rare. JSON is the go-to format for data in APIs — but JSON import or export in email systems is far less common than I would have expected. If available, it tends to be relatively rigid — along the lines of “you MUST use ‘first_name’ for first names.” Next vendor, next API - “you MUST use ‘First’”, and so on.

The terms import and export are another dimension of the data problem in marketing, particularly in email.

A recent Future of Email podcast episode featured Pierre Lipton (Forbes 30 under 30 winner) of 1440 Media, a “journalism reinvented” daily email news source focused on balanced and fact-based reporting. 

I discussed data challenges with Pierre in a follow-up conversation. While 1440’s commercial email platform is handling writing, editing, and sending reasonably well for now, his team has to run daily data import/export/re-import cycles to get the customer insights required for their business.

How much do you want to bet that CSVs are involved?

Import and export seem like reasonable things, but like CSVs, import/export is a kludgy, brittle approach. Point-in-time batch transfers create compounding problems — yesterday’s customer list ≠ today’s customer list. The more copies-of-copies, the worse it gets — and with crude, dumb transfer mechanisms like CSV, that coordination dance is brittle on top of brittle.

The gradual shift to APIs (REST, perhaps GraphQL) may look like a solution, but hands-on experience suggests otherwise. We’ve handled integration to dozens of ESPs at my company. A REST call is an improvement over CSV imports and batch files, but there’s still no overarching, commonly accepted data packaging approach. Vendor A wants {‘first’:’Matthew’}, vendor B wants {‘First_name’:’Matthew’}, so each REST integration is a unique one-off.

I’ve been to this goat rodeo before; more than a few years back, I helped found a non-profit which became the de facto technology standards org for the hospitality industry ( Hotel & hospitality businesses are a microcosmic digital mess. HTNG brought a vendor-neutral open industry standards process to bear, which eventually stuck.

For curiosity’s sake, here’s a screenshot from the HTNG Customer Profile Specification v3.1 ‘Create New Profile’ operation — sort of the hotel equivalent of a website email opt-in:

06 Dunn 2

Metadata standards like this look clunky and verbose, but this beats the socks off a flat-file CSV as an interchange standard. HTNG used a UK-based organization called The Open Group, who specialize in helping industries develop open standards. It takes a genuine investment from a wide range of players to develop standards, but the long-term payoffs can be tremendous.

Bringing HTNG into being took several years and a fair amount of politicking, and some key champions. I don’t know of a suitable industry body for this work in the email space.

Throwing hands in the air and saying “oh well” (again?) may be the only option. That’s a mix of ‘kick the can down the road’ and leaving sand in the gear-train. As privacy and data-use regulations expand, “dumb” will be less and less adequate, and I suspect we’ll continue to see people cycle out of the email space in part because honestly, who really wants to spend their day battling CSVs. It’s like going to work in a Rube Goldberg machine every day.

Looking forward to the OI Members-Only Live Zoom Discussion on this one – it’s June 16, 2022, from 12:00 Noon to 1:00 PM ET. Not an OI member? We have a few guest passes available – email to request one.

Related Posts