Cookie monster versus "soft" paywalls

Pretty much everybody who's talking seriously these days about asking users to pay for news content is pointing at the same model: Leave the website open to casual visitors, but require heavy users to sign up as paying customers. Let people see perhaps half a dozen stories a month, but if they show signs of high interest, present them with a bill for the content they're consuming.

That's the model being planned at the New York Times. It's the model that Journalism Online has described as having the most support in its talks with newspaper publishers. (I don't know what model Rupert Murdoch is planning, but then I suspect he doesn't either.)

The goal of this model is to preserve most of the site traffic that enables an advertising revenue model to work, while getting serious users to support the journalism they find valuable.

But there's a technical problem: HTTP cookies. To let casual visitors in the door while challenging regular users to pay, you have to rely on cookies, and the cookie monster won't let you do that. Cookies just aren't very reliable for that purpose.

Lots of websites require that cookies work. You can't log in to buy a book, schedule a hotel room, post a comment or check your bank balance without cookies. They generally work fine for that purpose. So where's the problem?

Those sites use what are called session cookies. When you log in, the website hands your browser a token that uniquely identifies you. If you log out, or close your browser, or reboot your computer, that token is thrown away.

This is fine for a short session, but to track how many pages you've read this month on the Daily Bugle's website, we're going to need cookies that last at least a month.

Fortunately, the cookie standard supports long-lasting cookies.

Unfortunately, human beings throw them out.

All modern browsers can be configured to accept cookies but destroy all cookies at the end of a session, or on a schedule (perhaps monthly). This takes some action on the user's part, so most people don't do it. But many do.

In fact, a comScore study found that 38% of users cleared out their browser cookies during the month of December 2006. And 7% of users were "serial resetters," clearing their cookie stores four or more times a month.

This has a lot of disturbing effects on your traffic measurements, but let's stay focused on the paid-content issue.

Every one of those cookie-clearing visitors is going to knock a hole in your "soft" paywall, because your paywall can't know who they are after they've flushed the cookie jar. They're going to waltz right through that hole and read whatever they want. They will never see your "please pay for access to this website" request.
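
To make the hole concrete, here is a minimal Python sketch of a cookie-based meter. It is an illustration, not anything the Times or Journalism Online has described: the cookie name, the six-story limit and the function names are all assumptions of mine.

```python
from http.cookies import SimpleCookie
from typing import Tuple

FREE_LIMIT = 6                    # assumed: "perhaps half a dozen stories a month"
MONTH_SECONDS = 30 * 24 * 3600    # cookie lifetime of roughly one month
COOKIE_NAME = "bugle_meter"       # hypothetical cookie name


def handle_pageview(cookie_header: str) -> Tuple[bool, str]:
    """Return (show_paywall, Set-Cookie value) for one story request."""
    jar = SimpleCookie()
    jar.load(cookie_header or "")
    try:
        # No cookie means the site has no memory of this reader at all.
        seen = int(jar[COOKIE_NAME].value) if COOKIE_NAME in jar else 0
    except ValueError:
        seen = 0
    seen += 1

    out = SimpleCookie()
    out[COOKIE_NAME] = str(seen)
    out[COOKIE_NAME]["max-age"] = MONTH_SECONDS   # persistent, not a session cookie
    out[COOKIE_NAME]["path"] = "/"
    return seen > FREE_LIMIT, out[COOKIE_NAME].OutputString()


def read_stories(n: int, clears_cookies: bool) -> int:
    """Simulate one reader opening n stories; return how often the paywall appeared."""
    cookie, walls = "", 0
    for _ in range(n):
        if clears_cookies:
            cookie = ""                        # browser discarded the cookie
        blocked, set_cookie = handle_pageview(cookie)
        cookie = set_cookie.split(";")[0]      # browser stores name=value only
        walls += blocked
    return walls
```

In this sketch, a reader who keeps cookies is challenged on every story past the sixth, while a reader whose browser discards the cookie between visits always looks like a first-time visitor and is never challenged.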

And it's going to get worse. Newer browsers have "anonymous browsing" modes that make this easier, and if you expect your users not to take advantage of such tools, you're fooling yourself.

I suspect that all of this will be greeted by "well, duh" by web geeks everywhere, but unfortunately most journalists and managers have no idea how these things work, so it needs to be said. You can't have your cake (a working ad model) and eat it too (a genuinely secure paid-access model).

You have to settle for a numbers game. Can you achieve a workable, "good enough" mix of free, paid, and "should have paid but didn't" access?

As you put together a spreadsheet to analyze this, be very careful with your assumptions.
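
As a starting point, the core of that spreadsheet can be sketched in a few lines of Python. Only the 38% clearing rate comes from the study cited above; the number of heavy readers and the share who agree to pay are made-up placeholders, and the model assumes the worst case, that every cookie-clearer resets the meter before it trips.

```python
def paywall_mix(heavy_readers: float, clear_rate: float, pay_rate: float) -> dict:
    """Split heavy readers into those who see the paywall and those who never do.

    heavy_readers: readers who would exceed the free limit (hypothetical input)
    clear_rate:    share who clear cookies within the metering window
    pay_rate:      share of challenged readers who subscribe (hypothetical input)
    """
    never_challenged = heavy_readers * clear_rate   # "should have paid but didn't"
    challenged = heavy_readers - never_challenged
    paying = challenged * pay_rate
    return {
        "challenged": challenged,
        "never_challenged": never_challenged,
        "paying": paying,
    }


# Placeholder inputs: 10,000 heavy readers, 38% clearing, 5% of the rest paying.
mix = paywall_mix(heavy_readers=10_000, clear_rate=0.38, pay_rate=0.05)
```

With those placeholder inputs, about 3,800 heavy readers never see the request to pay and about 310 become customers. Change any one assumption and the result moves with it, which is the reason for being careful.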


Well DUH! ;-)

Some anti-virus programs warn you about cookies and want you to throw them out. Also Firefox very easily gets into a state where it wants to clear your cookies on every restart, and many users don't know how it got that way and how to stop it. Also it doesn't account for users who access the site from different browsers and machines. Also there would spring up a cottage industry of sites explaining how to get around the silly paywalls. Also it's anti-web, anti-democratic (why should only people with credit cards get news?), easily defeated and rather grasping-at-straws-ish.

You're right, there are some clever workarounds for a lot of these models. But it's a game of percentages, and maybe the rule of 85 applies here. If 15% of the people out there are killing off their cookies, then that might be within an acceptable range to make a metered paywall successful. After all, there are several well-known workarounds to the Wall St. Journal's system, yet it is an extremely successful system even with those published exploits. There are ways around almost anything if you think about it enough (and want to be dishonest about it). Heck, I can go to the diner and read the paper for free every morning from one of the 3 stacked on the end of the counter. Or I can stick .75 in the big metal newspaper box outside and grab a copy for me and all my buddies before the door closes, too. I could walk over to my neighbor's box and stand there and read the obituaries in his paper every morning and put it back before he gets up because I'm not willing to subscribe just to read the obits, too. All easy workarounds to paying for a paper. But knowingly doing all these things is not great for your karma counter, you might be intentionally violating a business's model, and most importantly, some of them are just more effort than they're worth. If a newspaper can prove that you are purposefully clearing your cookies to reset your meter every day, and they can tie that activity back to you as a customer of an ISP, could that be considered theft of services? Maybe not, given a lot of "innocent" reasons for periodically clearing our cookies, but the concept is not absurd. I don't know about you, but every time I clear my cookies it's a real pain to get back in to all my banks' websites, all my credit card websites, etc. To be completely honest, clearing my cookies is more work than it's worth with my privacy - or to reset my meter. Lastly, there are some clever ways to increase the effectiveness of persistent sessions. HTTP cookies are not the only option for local storage. Some of the software behind these metered paywalls we're seeing spring up is incorporating those technologies. Don't forget, it's not the journalists and managers you speak of who are building the technology behind these systems; they are just building the model. It's a bunch of techs and geeks, a bit more fluent in the craft of web development, who are already thinking of this "duh" hole, and probably doing their best to plug it to get that 85% effective rating up to 95%.

Isn't it a different issue with the NYT site? Because it requires registration, they can surely log the number of articles each registered account has read. The issue would be the ease of creating multiple free accounts. Still, on any site a paywall puts you in constant battle against piracy. That enforcement cost doesn't get enough attention.

Anonymous is right: This is a numbers game, not a game of absolutes. If you're looking for an 85-15, you're not going to find it in a 62-38 environment. Too many people clear their cookies over the course of a month.

But you might find it workable if you narrow your time window to a week. Remember, you're not trying to build a bank. You're just trying to identify a number of people who might be good candidates to pay something. It's more like passing the collection plate at a church, but only to the people who come regularly. Tough to pull off, but maybe you learn to recognize a good number of them.
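
For a rough feel of what a shorter window buys, here is a back-of-the-envelope Python sketch. It rests on an assumption I can't defend very far: that every user is equally likely to clear cookies on any given day, independently of the day before. The function name and the 31-day month are mine; only the 38% monthly figure comes from the study.

```python
def clear_rate_for_window(monthly_rate: float, window_days: int,
                          month_days: int = 31) -> float:
    """Estimate the share of users who clear cookies within window_days.

    Assumes a constant, independent daily chance of clearing, derived from
    the observed monthly rate. This is a simplification, not a measurement.
    """
    daily_keep = (1.0 - monthly_rate) ** (1.0 / month_days)  # chance of keeping cookies one day
    return 1.0 - daily_keep ** window_days


weekly = clear_rate_for_window(0.38, window_days=7)
```

Under that assumption the weekly figure comes out at roughly 10%. If clearing is instead concentrated among habitual clearers, such as people whose browsers wipe cookies at every exit, shortening the window helps much less, and the real weekly number lands somewhere between that 10% and the monthly 38%. Server logs, not this formula, have to settle it.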

So is there a sweet spot in the numbers, and if so, what is it? Most journalists avoided math courses in college, but there are other disciplines inside newspaper companies (not just the techies) with analytic skills, and I would expect that the research people at the Times are going to be very busy over the coming months, trying to pull useful information out of the mountains of data in their server logs.

I know that analysis, testing and more analysis is a major focus of Journalism Online as well. Unlike the Times, they don't have just one shot at getting it right, so as smaller newspapers experiment you can expect to see several different pricing models and several different thresholds.

Jeff raises the issue of existing registrations. They're definitely an asset, but it's unclear how much of one.

The Times radically loosened up its registration requirements quite a while back (presumably chasing higher traffic numbers). I've had an account since the beginning, but I switch browsers and computers and operating systems a lot, so I'm typically hitting their site as an anonymous user. I haven't seen a login challenge for a long time. I would expect that its current user base includes a high number of casual users who have never registered.

In its announcement, the Times said print subscribers will not have to pay. This cuts the economic opportunity somewhat and adds a thick layer of expense -- technology to reconcile online accounts with print subscriptions, and customer service to cover for the fact that the technology (especially print circ) sucks.

At our company, we looked at tying online registration and print circulation systems several years ago, when registration was all the rage. In fact, we wrote a five-year strategic plan that said we'd do it. We did not. It's not as simple as it looks. The costs of mating up to an array of existing systems (and apparently no two are configured alike) wiped out any benefits we might imagine from a unified customer database.

Cookies could be tied to IP addresses. No single IP address could access a site more than n times unless it had a cookie installed. I have no idea how cable/DSL IP addresses are allocated in the US, but I'd imagine they're roughly constant, which would make this an incentive to install said cookie. Saying that makes me feel I gave another avenue for a media company to deny me access to what they produce, an entirely bizarre concept at its core.
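
The rule proposed here can be sketched in a few lines of Python. This is only one reading of the idea; the class name, the method name and the choice to count only cookieless requests are assumptions for illustration.

```python
from collections import defaultdict


class IpMeter:
    """Allow at most n cookieless requests per IP address (hypothetical rule)."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.cookieless_hits = defaultdict(int)   # IP address -> requests seen without a cookie

    def allow(self, ip: str, has_cookie: bool) -> bool:
        if has_cookie:
            # Readers who keep the cookie are metered by the cookie instead.
            return True
        self.cookieless_hits[ip] += 1
        return self.cookieless_hits[ip] <= self.n


meter = IpMeter(n=3)
results = [meter.allow("203.0.113.7", has_cookie=False) for _ in range(4)]
```

In this sketch the fourth cookieless request from the same address is refused. Note that the counter is per address, not per person, so everyone sharing an address draws on the same allowance.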

I predict the NYT's system will become an infrastructure for people who want to pay for and support the Times anyway. There will be loopholes, just as you can use Google to read The Wall Street Journal for free. Some people enjoy hacking the loopholes, while other people are willing to pay for the more comfortable feeling of browsing their favorite brand. After all, Apple turned the outlaw world of free music downloads into a profitable retail operation. The bigger picture is how Apple is turning the Google-searched open sprawling Internet into a collection of brand-oriented apps.

Put a Facebook Connect login on the NYT site, which makes it super-convenient to log in (the Washington Post is a great example of this), and it's not even an issue. I'm going to venture that the NYT has thought of this already.

@ anonymous 11:51pm wrote: "every time I clear my cookies it's a real pain to get back in to all my banks' websites, all my credit card websites, etc. To be completely honest, clearing my cookies is more work than it's worth with my privacy - or to reset my meter." The suggestion is that many users are going to find it too much trouble to reset their cookies. This brought to mind a couple of thoughts. First, I use two browsers now. I use Firefox as my primary browser. It's where I do my work, and my work requires me to keep several tabs open on a persistent basis. I found that having tabs open for my social media sites got in the way, so I started using Safari as my social media browser and for logging onto Google services I use that have logins that don't match my primary Google login. It occurs to me that if I really wanted to get around a pay wall, I could use Chrome for that -- just dump all cookies at every session, so I'm not messing up any of my other logins. I'm sure that sort of thinking will occur to other people, especially the tech-savvy among the most ardent (for example) NYT readers. The Times might have a bigger problem with this than smaller community papers, in fact, given the difference in audience and motivation for the news. (Me, I only read the Times when a link is recommended, so I'm not likely to ever pass the meter threshold anyway.) Further, good point elsewhere about multiple devices and tracking those. The correct attitude toward pay walls might be that it's a question of loyalty -- convincing a big enough segment of your audience that what you have is worth paying for and that if they don't, they're not really being loyal, or supportive or ensuring the ongoing product they deserve. You worry about collecting from whom you can and don't lose sleep over the rest, because any method of cheating the system is stealing, even if it is impossible to catch, and most people, even when they think they can't get caught, are honest, especially if they see the potential victim of theft as something/someone they value. Impersonal Walmart has a much bigger problem with shoplifting than the ma/pop store down the street, because the customers of Walmart feel no personal connection to it, but the customers at the little store down the road are very loyal to it and its owners. The worrying about how people will get around pay walls (not that I'm advocating pay walls at all) reminds me of all the arguments against free site registration -- people will just lie about who they are or use BugMeNot to defeat registration. And real world experience showed those fears/predictions were completely unfounded. The vast, vast majority of people were honest with their local newspaper.

IP addresses fail for a couple of reasons, the big one being heavy reliance on single-address firewalls by large institutions (businesses, schools, government centers). News sites get a lot of their traffic from those sources. Even in a residential setting you're likely to have multiple users.

Facebook Connect is worth a look (we're evaluating it for completely different reasons) but it doesn't preserve the "casual visitors are unmolested" aspect of the model. Facebook does, like everyone else, rely on cookies to keep you logged in.

As I've said before it will be a lot of shake and bake with very little impact on the future of the company. It may be a marginal success, but the real challenge facing the company is not whether they can make 10% more by charging than by being free and serving more ads.

Jay Rosen points out that the Times is planning to allow free pageviews if they're referred by "another Web site." An interesting twist.

Flash shared objects are another place pageview info could be kept. It still can be deleted off the file system by visitors, but at least the delete function isn't built into browsers (and you'd have to know your way around the innards of Flash).