January 28th, 2010 | Google, Privacy | Dave Thompson

Google: Skynet?
In honor of International Data Privacy Day, Google just released a list of five “Privacy Principles.” Google said it will implement the following ideals when creating new products and services:
- Use information to provide our users with valuable products and services.
- Develop products that reflect strong privacy standards and practices.
- Make the collection of personal information transparent.
- Give users meaningful choices to protect their privacy.
- Be a responsible steward of the information we hold.
These are important principles and they are a great start for a company that collects as much data as Google does.
But these five principles are focused on Google’s own use of data. It is a “Web 1.0” model of privacy, where all of the concern is focused on how Google itself uses the data it collects. Call it a commitment to “Privacy 1.0.”
One important concept is missing entirely from Google’s list: social privacy.
We live in a Web 2.0 world. Data flows through Google in a million ways: through search, through Blogspot, through YouTube, and more. Even if Google promises to not use any of this data itself, thousands of other people can. A video of you hosted on YouTube and found through a Google search can have a far greater impact on your privacy than Google’s use of contextual advertising to serve you ads about suntan lotion when you search for “Bermuda.” Think about it: do you care more about contextual advertising, or a video of you that comes up for any Google search for your name? But Google’s privacy principles do not address this at all: they are entirely focused on Google.
In other words, even if Google promises that it will not misuse data, that does not mean that Google is respecting your privacy. Google is part of a larger privacy ecosystem. In fact, Google is perhaps the largest and most powerful part of the Internet’s privacy ecosystem. Google’s products (search, Blogger, YouTube, and more) connect more people to more information than any other company in history. It is crucial that Google recognize its role as the central connection in a massive data ecosystem. If Google creates a system that allows other people to violate your privacy, Google is complicit.
Take just a few examples that Google’s privacy principles do not even consider. Each of these has significant privacy implications:
- If the first result for a search for your name is a site with your home address and phone number
- If the first result for a search for your name is a site that displays your medical history, HIV/AIDS status, sexual orientation, or other private information
- If the first result for a search for your name is a “hidden camera” video of you
- If someone else created a blog about you through Google’s BlogSpot service that listed everything you did every day
- If someone else posted a video of you on YouTube that contained false and defamatory lies
- If a health insurer uses Google to search for your name near “cancer”, “diabetes” and “overweight” before denying you coverage
- If an employer uses Google to search for what you are doing in your off-hours and finds that you are politically active in a way that disagrees with the boss
People can disagree about what Google’s obligation is to address each of those situations. But Google’s current privacy principles don’t admit that these are important questions, let alone address this social side of privacy. Call this new form of privacy “Privacy 2.0”: the concern that your information will be misused by “300 million little brothers” rather than Orwell’s Big Brother. We’ve previously discussed the same principle as applied to Facebook: the concern is not that Facebook itself will violate your privacy, but rather that Facebook will empower other people to violate your privacy.
Google’s “privacy principles” are entirely focused on the old view of privacy, when the biggest fear was that Google itself would violate your privacy. It’s easy to protect your privacy from Google that way: just don’t use Google.
But in the Web 2.0 world, it is time for Google to accept that its privacy choices have impacts that go well beyond its corporate use of data. Google can create a system that allows users to protect their privacy from others. As the largest and most important information provider, Google has an obligation to at least consider these privacy implications. Its “privacy principles” don’t appear to even admit that its privacy practices affect a lot more than just its internal data use. It’s time for Google to catch up with Privacy 2.0.
January 22nd, 2010 | Facebook, Privacy | Dave Thompson
The good old days of paper records. Image courtesy Ed Uthman via CC license.
It’s time for web companies to learn how to forget. It’s particularly time for Web 2.0 companies to learn how to forget.
The digital nature of the Internet makes it easy for websites to collect massive amounts of data: every click, every interaction, every search term, every referrer, every error… you get the idea. This massive data harvest can be dumped into a SQL database to be analyzed, cross-tabulated, summed, totaled, averaged, and dissected. In general, this is good. Web companies should learn from their visitors, and web companies should take advantage of the power of digital data collection. Important trends can be spotted, and products can be improved.
The problem comes when companies keep too much data too long. Take the example of a search engine. To a search engine, it is very useful to know what search terms are popular today: Google uses each day’s search terms to compile a list of the hottest search terms of the day, and undoubtedly uses the same data for anti-spam and quality control. So far, so good. Google is using its data in interesting ways for an appropriate amount of time.
The problem comes when data connecting search terms to individual users is kept too long. Six months from now, your search queries don’t matter. Maybe there’s some data that is useful in the aggregate (like the hottest search terms of the year, used to create Google Zeitgeist), but Google doesn’t need to know who entered each search query; the data has become stale and less valuable. Keeping non-aggregate data around too long is an invitation to privacy breaches, like what happened when AOL revealed thousands of search histories. AOL claimed that the data was anonymized, but it was possible to identify many individuals. Even more data can be revealed when web servers are hacked—Google claims that its servers were recently attacked in China, and it is not publicly known how much data was accessed. The more data that was still on Google’s servers, the more data could have been revealed. The same goes for insider theft, computers left unsecured, and any other means of getting at the data.
To put it simply, the cost-benefit tradeoff of keeping data changes as the data gets older. The benefit of keeping data decreases as it ages; data that has business value today (like clickstream data, search queries, and website interactions) loses value over time because it becomes too stale to use for business decisions. If long-term trends need to be spotted, then data can be aggregated and the original fine-grained data destroyed.
But the cost of keeping old data doesn’t decrease: to end-users, revealing old data can be just as harmful as revealing new data. A site that reveals embarrassing search queries from 2 years ago is just as dangerous as a site that reveals embarrassing search queries from last week. Here, Web 2.0 companies are particularly at risk. They know a ton about users’ social, political, and inner lives, information that is often very personal. They often know every interaction between two users: What profiles have you been clicking on? What messages have you been sending? Who have you “poked” lately? Were you on the Jersey Shore fanpage for an hour looking at pictures of Snooki? A site that collects this information is constantly at risk of losing it.
The solution is to destroy data, or at least take it offline and preferably move it into non-digital form. Search engines have recognized this in part, and have generally similar plans to destroy clickstream data within 6-18 months. But it’s not clear that a lot of Web 2.0 companies do. I know that many of my old Facebook interactions are still stored in a production database because I can still access them. There is simply no need for this data to still be in a production database that is vulnerable to hacking, data leaks, insider theft, and more. One data security incident could reveal the entire history of social interactions on the site. This is a privacy Sword of Damocles, silently hanging over every user’s head. What embarrassing thing have you done on Facebook in the last few years? What private messages have you sent? With one data dump, it could all be revealed.
Instead, Facebook could simply announce a policy to archive all interactions more than 12 months old, then move them offline. Or it could just delete them entirely: do we really need 5 years of history of “pokes”? Or, if users really want to keep their data, then let users download an archive with all their interactions and delete them from the server.
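As a rough illustration of what such a policy could look like in practice, here is a minimal Python sketch. It assumes a hypothetical record format (dicts with `user`, `action`, and `timestamp` keys) and a 12-month cutoff; it is not based on how Facebook or any search engine actually stores data. Records older than the cutoff are reduced to anonymous per-action totals, and the fine-grained originals are dropped.

```python
from collections import Counter
from datetime import datetime, timedelta


def apply_retention(records, now, max_age_days=365):
    """Keep recent records; reduce older ones to anonymous totals.

    `records` is a list of dicts with the hypothetical keys
    'user', 'action', and 'timestamp' (a datetime).
    Returns (kept_records, aggregate_counts).
    """
    cutoff = now - timedelta(days=max_age_days)
    kept = [r for r in records if r["timestamp"] >= cutoff]
    expired = [r for r in records if r["timestamp"] < cutoff]
    # Only per-action totals survive for expired records:
    # no user names and no timestamps are carried over.
    aggregate = Counter(r["action"] for r in expired)
    return kept, aggregate


# Example with made-up data.
records = [
    {"user": "alice", "action": "poke", "timestamp": datetime(2008, 6, 1)},
    {"user": "bob", "action": "poke", "timestamp": datetime(2009, 12, 1)},
    {"user": "alice", "action": "message", "timestamp": datetime(2007, 3, 15)},
]
kept, aggregate = apply_retention(records, now=datetime(2010, 1, 22))
print(len(kept), dict(aggregate))
```

The sketch only covers the live data set. A real policy would also have to decide what happens to backups, logs, and any archive handed back to the user.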
To be fair, forgetting is hard. Why don’t web companies forget more often? Often, it’s just inertia. It takes programmers’ energy to archive data, and it takes careful business decisions to determine when and how to archive data. Sometimes it’s like an overdue library book: you know that you need to return it, but you just never get around to it until it is very overdue.
Sometimes, the good old days are best. Remember paper files? Paper records are nothing like digital: they are slow to process, hard to store, and are corrupted over time. But maybe those are features rather than bugs.
In bullet points:
- Web companies collect massive amounts of data
  - Clickstream, social interactions, emails and messages, credit cards and payment info, preferences, actions, and activities…
- It often seems easier to keep old data than delete it
  - Disk space is nearly free, and databases make it easy to keep old records
  - Programmers often think that old data will have some kind of marketing value
  - Archiving is a pain
- But old personal data can be embarrassing or dangerous
  - Information about people’s financial, social, and political beliefs can cause embarrassment
  - Some data that seems benign (like your Netflix movie rentals) can reveal a lot more (like your sexual orientation)
  - Some data that has identifying information removed can still be used to identify you (like your AOL search queries)
  - Information about people’s names, addresses, and family can cause safety issues and encourage identity theft
- That said, information about places, things, and science should be more available
  - News reports, scientific papers, and scientific data generally do not present the same privacy problems
- Old digital data is particularly likely to be problematic
  - Data that is instantly accessible in a production database is instantly accessible to a hacker or data accident
  - Insiders can leak data, intentionally or accidentally
  - Once out, it can be digitally scanned, searched, sorted, and remixed
  - Old data is less likely to be useful in a live environment
- There are solutions
  - Move content into an archive that the user controls
  - Delete marketing and clickstream data
  - Research and trend data can be aggregated
- There’s something to be said for paper records. Paper records have a very high transaction cost; that can be a feature, not a bug.
January 13th, 2010 | Online Reputation Management | Dave Thompson

Michael Fertik talks privacy with Stanford Law students
Yesterday, CEO Michael Fertik was invited to present at Professor Jonathan Zittrain and Elizabeth Stark’s “Difficult Problems in Cyberlaw” class at Stanford Law School. Fertik joined a distinguished panel that included Stanford Law fellow Ryan Calo, Yahoo’s Director of Human Rights Ebele Okobi-Harris, and a contingent from the Mozilla Foundation including UI guru Aza Raskin and executive director Mark Surman. Other guests to the class included Lauren Gelman, Mozilla counsel Julie Martin, and a technology delegation from Japan.
The class engaged in a spirited discussion about the meaning of privacy on the Internet today, ranging from whether “privacy is dead” (in the wake of comments from Facebook CEO Mark Zuckerberg suggesting that he thinks users don’t value their privacy anymore), to whether conceptions of “privacy” online should include information disclosed by your friends through their blogs and social media.
If you were at the class, we welcome continued discussion.
January 11th, 2010 | Facebook, Social Networking | Dave Thompson
Facebook started as a closed system that protected users from the mistakes of their friends — it acted proactively to protect privacy.
But, now (according to Facebook CEO Mark Zuckerberg), Facebook cares about your privacy only as much as complete strangers do. He says that Facebook will simply follow “social norms” in deciding privacy policies. Is this a responsible position for somebody who controls one of the largest privacy ecosystems in the Web 2.0 world? Or should Facebook step up and serve as a privacy gatekeeper?
The backstory: Facebook started as a closed environment with thorough privacy controls. It was different from Google in that you were effectively in control of what information about you was made available: if you didn’t want to post anything and de-tagged yourself, then very little information about you would be visible. It was a welcome change that allowed people to feel comfortable expressing themselves without the whole world having to know.
Facebook changed all that with Newsfeed, which started to automatically inform groups of people that somebody uploaded new content. More recently, Facebook stirred headlines by changing its privacy controls and encouraging users to make all of their content visible to all other users (“A guide to Facebook’s new privacy settings”). There have been some unintended consequences of that decision (e.g., “Facebook loophole allows extensive data mining”, “Online exhibitionists undermine our right to live a quiet life”), but most analysis has focused on the impact on users’ own choices: are you fully aware of the consequences of choosing to make your photos visible to the world?
At the Crunchies awards, Facebook founder and CEO Mark Zuckerberg was interviewed by TechCrunch’s Mike Arrington (video here). When given the opportunity to talk about what Facebook was doing for privacy, Zuckerberg turned it over to users and said that Facebook would just reflect “social norms” for privacy:
Mike Arrington (TechCrunch): “Where is privacy on the web going over the next couple of years, do you think?”
Mark Zuckerberg (Facebook): “Well, it is interesting looking back. When we got started just in my dorm room at Harvard, the question a lot of people asked was ‘why would I want to put any information on the Internet at all? Why would I want to have a website?’ Then in the last 5 or 6 years blogging has taken off in a huge way and all these services that have people sharing more information. And people have gotten comfortable not only sharing more information and different kinds but more openly and with more people. That social norm has evolved over time. We view it as our role in the system to constantly be innovating and updating what our system is to reflect what the current social norms are. A lot of companies would be trapped by convention and their legacy and the systems they built. Doing a privacy change for 350 million users is not the type of thing that [crosstalk] a lot of companies do. We view it as a really important thing to always keep a beginner’s mind and think ‘what would we do if we were starting the company now and starting the site now?’ We decided that these would be the social norms now and just went for it.”
Facebook privacy settings
Analysis: According to Zuckerberg, Facebook’s privacy stance reflects only “social norms” for privacy. In other words, your privacy is protected only if other people want it to be protected. And it sounds like Zuckerberg thinks people are pretty exhibitionistic these days.
There’s nothing wrong with empowering users to control their own privacy. Some people want to be public and some people want to be private; Facebook appears to accommodate both types. Facebook should be commended for giving users the choice to make their own photos visible to the world, to just their friends, or to nobody at all.
But, what Zuckerberg seems to be missing is the fact that Facebook is about more than just your own photos; every action on Facebook affects a much larger privacy ecosystem. The dangerous part is that Mark Zuckerberg doesn’t even seem to know it.
The privacy ecosystem: Every action on Facebook affects a larger ecosystem. To take just one example, let’s say your friend Steve uploads a photo of you and tags you: if Steve has the new default privacy controls then Steve’s decision will affect his wall (showing the new photo), your wall (showing that you got tagged), Steve’s friends (their Newsfeed shows Steve’s upload), your friends (their Newsfeed shows that you got tagged), and your mutual friends (their Newsfeed shows both the tag and upload). If Steve followed the new default Facebook privacy settings then the photo of you will be visible to every single Facebook user: over 350 million people.
Even if you have your privacy set to the highest level (or have turned off photo tagging entirely), the photo is still automatically displayed according to Steve’s privacy settings—and because many of your friends are probably Steve’s friends too, they’ll still see the photo of you.
The result: Even if you try to control your own privacy on Facebook (by setting your privacy to the maximum) Facebook has now empowered other people to destroy your privacy for you. Facebook started as a closed, privacy-protective system. But Facebook has become Google: an open platform where anybody can see anything about anybody. It’s a new Wild West out there. And, thanks to facial recognition, it might not even matter that you de-tag every photo of yourself.
Facebook’s decision to encourage users to make more content available to more people makes the privacy problem much worse in subtle ways. Under the old system, if an acquaintance posted an embarrassing photo of you that was only visible to his friends, then maybe 100 people would see it (stat: the average Facebook user has 130 friends). That’s bad, but not the end of the world. But if your acquaintance makes that photo visible to friends-of-friends, then suddenly the photo is visible to an audience two orders of magnitude larger (theoretically, up to 17,000 people, but realistically more like 10,000 because of social overlaps). The same goes for any other embarrassing, false, outdated, scandalous, or private content: a revealing photo intended for an audience of 1, a drunken mistake, information about your sexual orientation or health status, etc.
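The arithmetic behind those numbers can be sketched in a few lines of Python. The 130-friend average is the statistic cited above; the 40% overlap figure is purely an assumption, chosen only to show how the theoretical ceiling of roughly 17,000 shrinks toward 10,000 once mutual friends are counted once.

```python
AVERAGE_FRIENDS = 130  # average friend count cited in the post


def potential_audience(avg_friends, hops, overlap=0.0):
    """Estimate how many people can see content shared `hops` steps out.

    hops=1 means friends only; hops=2 means friends-of-friends.
    `overlap` is an assumed fraction of duplicate (mutual) friends.
    """
    return round(avg_friends ** hops * (1 - overlap))


friends_only = potential_audience(AVERAGE_FRIENDS, 1)        # 130
friends_of_friends = potential_audience(AVERAGE_FRIENDS, 2)  # 16900
# Hypothetical 40% overlap between social circles.
with_overlap = potential_audience(AVERAGE_FRIENDS, 2, overlap=0.4)
print(friends_only, friends_of_friends, with_overlap)
```

The jump from one hop to two is what matters here: a single settings change multiplies the audience by the average friend count.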
In the old days of Facebook, the site respected privacy choices. Before Newsfeed, your own wall was the only effective way to communicate. New information was not automatically blasted out to hundreds (or even thousands) of people. But now, thanks to Newsfeed and Facebook’s push toward reduced privacy, if anybody uploads a photo of you, hundreds or thousands of people are instantly notified (your friends, the uploader’s friends, and the friends of anybody else who is tagged in the same photo). Even if you turn photo tagging off, many of your friends will still receive an automatic update: thanks to the intertwined nature of social groups, you will probably have many friends in common with anybody who is uploading photos of you, writing about you in “notes”, or gossiping about you on their wall.
The dangerous catch is that you can’t solve the problem by leaving Facebook. Normally, when a company creates a privacy problem, there is a risk of a mass exodus. This risk is enough to get most companies to clean up their acts and protect their users. But leaving Facebook doesn’t solve the privacy problem—people will still upload photos and other content, which will be blast-distributed to hundreds of people.
To be clear, Facebook isn’t the end of the world and Zuckerberg isn’t evil. I use Facebook every day and Mark Zuckerberg is an outstanding entrepreneur. But, Facebook is still polluting the privacy ecosystem. Every day, Facebook takes more privacy control away from you and gives it to other people — your friends, acquaintances, and enemies.
Maybe reducing third-party privacy is a savvy business move for Facebook. Paradoxically, people have more power over their privacy after joining Facebook than before: they can detag themselves completely and be notified when new content about them is available. But not all profit-increasing decisions are good for the privacy ecosystem. Should Facebook recognize that it has a leading (rather than following) role in choosing how privacy should be protected in a complex ecosystem? Should Facebook have a social responsibility to encourage privacy protection?
November 2nd, 2009 | Online Reputation Management, Reputation Insurance, Search Engines | Dave Thompson
NEW JERSEY –
Levinson Axelrod is one of the oldest and most prestigious consumer injury law firms in New Jersey. It has been in business since 1939 and has won more than $250 million for its clients in the last 5 years alone.
But you wouldn’t know that by looking at the Google results for the firm’s name. If you search Google for “Levinson Axelrod” today, the fourth result is a site called “Levinson Axelrod Sucks” at levinsonaxelrod.net (intentionally not linked to avoid making it any more visible in a Google search).
One negative search result has undermined a law firm’s careful brand-building efforts.
The site is a gripe site built by Edward Heyburn, an employee fired in 2004. The firm says that Heyburn was fired because he was planning to open his own firm. Heyburn claims that he was caught in disagreements about the firm’s direction and feels that the firm is not living up to its image of the working man’s protector.
The site lists many alleged problems at the law firm. Heyburn’s site claims that some of the firm’s accomplishments (like “Super Lawyer” status) were actually paid advertising and not independent awards, and that the firm called him just to disrupt him when he was in the delivery room assisting his wife in labor. The site also highlights recent courtroom losses by Levinson Axelrod lawyers — something that can’t be found on the firm’s official site.
To a consumer law firm, image is everything
A firm like Levinson Axelrod depends on clients who are willing to trust the firm with claims that might be worth millions of dollars. When deciding on which lawyers to pick, clients depend almost entirely on the firm’s image and reputation.
After all, you can’t judge the skill of lawyers by their names alone. Instead, potential clients want to hear about experiences other people have had with a firm, and evaluate the firm’s reputation in the community. Law firms spend tens of thousands of dollars on branding and image maintenance just to make sure that customers get an impression of strength, skill, and experience in the courtroom.
By itself, even the most carefully-designed website cannot save a firm from brand damage through Google.
But all of that careful branding can be undermined by Google. Potential clients might look at the firm’s website to learn more. But informed consumers will also routinely search for the firm’s name in Google to find out what other people have had to say. They will look for other experiences — positive and negative — that are described on Google before entrusting any firm with a claim that could be worth millions.
Here, the “gripe site” lists many complaints about the law firm. Some of them might be true, or all of them might be false. But it’s certain that many consumers won’t take the time to sort out which allegations are true and false; there are plenty of personal injury lawyers in New Jersey, so why take a risk on Levinson Axelrod when there are other firms with pristine reputations? Maybe consumers should spend more time evaluating the gripe site’s complaints, but for most consumers, any complaint is enough to cause them to just click onto the next law firm’s site.
The same is true for doctors, lawyers, contractors, construction companies, and almost any other service industry: customers rely on reputation to make the decision to hire a service firm. And today, Google is the number one source of reputation — whatever Google says about you is what consumers will think about you.
SEO was not enough
Levinson Axelrod has spent a lot of energy on making its website come up first in search engines (“search engine optimization” or “SEO”). The firm name comes up first in searches for terms like “New Jersey Personal Injury Lawyer,” “NJ injury lawyers” and other targeted keywords. These are keywords that are worth $20.00 – $30.00 per click because they are very likely to lead to high-paying clients. One analytics tool estimates that this SEO is worth the same as $15,000 a month in online advertising — or $180,000 a year.
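To make that arithmetic explicit, here is a small Python sketch. The monthly estimate and per-click prices are the figures quoted above; the implied click counts are only a back-of-the-envelope inference, not numbers reported by the analytics tool.

```python
MONTHLY_AD_VALUE = 15_000         # dollars per month, as estimated above
COST_PER_CLICK = (20.00, 30.00)   # dollars per click, low and high

annual_value = MONTHLY_AD_VALUE * 12

# Clicks per month that the estimate would buy at each price point.
max_clicks = MONTHLY_AD_VALUE / COST_PER_CLICK[0]
min_clicks = MONTHLY_AD_VALUE / COST_PER_CLICK[1]

print(f"Annual value: ${annual_value:,}")
print(f"Implied clicks per month: {min_clicks:.0f} to {max_clicks:.0f}")
```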
SEO is important, especially to firms that benefit from consumer traffic. But SEO didn’t stop one ex-employee from trashing the firm’s image. Any potential client that searches Google for “Levinson Axelrod” sees the “Levinson Axelrod Sucks” site right on the first page of results — and right below the firm’s official site. All that SEO value is undermined every time a potential client starts to do his or her due diligence on the firm, starting by searching for more information about the firm through Google.
SEO is designed only to get one or two pages to the top of a Google search; it is not designed to protect a brand or image online.
A lawsuit was not enough
The law firm has sued the ex-employee. It should come as no surprise that a lawyer saw this problem as a lawsuit waiting to happen.
But the lawsuit hasn’t solved their problem: the web page is still up while they wrangle it out in court. In fact, the legal conflict has actually drawn more attention to the website: a story of high-powered lawyers fighting each other is a guaranteed draw for reporters (see the Streisand Effect). And, of course, the ex-employee has every incentive to play up the media angle as much as possible.
Online monitoring and Google management: A better answer
If SEO and litigation aren’t enough, what is? Comprehensive monitoring and Google image management that will push truthful positive information to the top of search results for queries related to your company name or brand. If positive information fills all the top spots in a search, then false and disparaging information is pushed off the first page of results.
Proactive Google management now, before something goes wrong, can help keep false and negative information from ever reaching the front page of a Google search. Building positive, truthful content today can block many types of negative information from ever appearing on the first page of a search. It’s almost like a form of “Google-proofing” your brand: an investment today can stop damage in the future. And if false information never appears on the front page, then it won’t have power to undermine your brand or spark a media frenzy. A former employee’s complaint site that stays hidden is like the ex-employee shouting into the wind: they might spend a lot of energy and vent a bit, but it is not likely to damage your brand.
Learn how you can use tools like MyEdgePro to protect your company’s image from attack today; visit http://www.reputation.com/myedgepro or contact Reputation.com for more information.