Big Data

Training tomorrow’s Marketers on Big Data

Big Data Boot Camp for Marketers

Big Data is rocking the marketer’s world. It is a wake-up call that marketers need to be more metrics driven, more technically savvy and more process oriented. At the top of the food chain, CMOs are taking on responsibilities that traditionally belonged to CIOs. And at the middle management level, marketers are being required to be more technical and metrics oriented.

The days of just fishing for eyeballs or operating on gut instinct are long gone. It is no longer acceptable to look only at demographics or psychographics, or to just count eyeballs. Instead, marketers need to focus on the numbers: people’s tribes, their interests, and their online behavior, whether that means surfing a website or mobile app or transacting with a page or shopping cart.

Most marketers would agree, however, that they are not prepared for the incoming Big Data wave: they lack resources, lack data know-how, and they don’t know how to get started.

According to a study from The Economist Intelligence Unit, only 24% of marketers use data for actionable marketing insight. Furthermore, in that same study almost 50% of marketers cited a lack of capacity to analyze big data. Some companies are increasing their budgets for Big Data analytics. The problem is that there’s no road map for getting these marketers up to speed.

Rather than focus on the bells and whistles (the technology) of big data, here are 7 steps a marketer can take to get out of their comfort zone and jump into the Big Data world:

  1. Understand the definition of Big Data, which is usually defined by the 3Vs:

    1. Volume or the amount of data involved

    2. Variety or how the data is structured

    3. Velocity or the rate at which it is generated and analysed

  2. Subscribe to and learn from a few key bloggers, who can teach you the ropes:

    1. SemAngel Blog by Gary Angel: Gary brings over twenty years of experience in decision support, CRM, and software development. Gary co-founded Semphonic and is the President and Chief Technology Officer. But don’t let the CTO title fool you. Gary is the brightest consultant I have worked with and can take complex technical issues and break them down into easily digestible and understandable chunks for marketers.

    2. Analytics Blog by Justin Cutroni: Justin is currently the Analytics Advocate at Google, so he has a boatload of knowledge. In his blog, he breaks down digital analytics for businesses.

    3. Customer Analytics blog by SAS: This blog is for anyone who is looking for ways to improve the business of marketing and communicating with customers, which includes everything from multi-level marketing to social media campaigns.

    4. Big Data Hub by IBM: This blog is filled with case studies, videos, etc. from key players at IBM and beyond.

    5. Business Analytics Blog by Timo Elliott: Timo is an Innovation Evangelist for SAP. This blog contains his personal views, thoughts, and opinions on business analytics.

  3. Get your organization big data ready:

    1. Tear down your organization’s silos and engage multiple departments

    2. Give team members homework — tell them to read the blogs mentioned above.

    3. Think about how you will link your current data infrastructure to your project (that means a business analyst, an IT guy, etc. should be involved in the meeting)

    4. Know and recognize that Big Data is a team sport

  4. Work with a framework your organization agrees on, such as:

    1. Define Your Goal

    2. Understand your resources

    3. Review key segment’s Journey

    4. Confirm you are capturing data during each phase

    5. Establish benchmark

    6. Create a small measurable deliverable (test)

    7. Track over time

    8. Establish toll gate reviews

    9. Expand program

    10. Tweak your programs as needed

  5. Define the desired outcome and the one question you want to answer

    1. Yes, narrow it down to one (primary) question

    2. Answer the question and move on

  6. Understand your inputs by breaking down your customers’ journey

    1. Identify the different sources of data, such as social network behavior, information from third party lists, mobile usage, downloads, etc.

  7. List out different types of potential metrics you could track:

    1. Information related specifically to the customer’s transactions (or actions)

    2. Information related to a segment’s usage patterns

    3. Information related to the overall marketing program

In some respects Big Data is just an extension of database marketing, a popular term in the 1980s and 1990s, because it focuses on leveraging customer information to segment an audience and develop personalized campaigns. The biggest difference now is that we can leverage unstructured data (video, for example) and implement just-in-time programs.

I am a big believer in learning by doing. If a marketer really wants to figure out how to integrate big data into their business processes, they need on-the-job training. (And to that point, I actually believe this is important for the CMO as well as the Business Analyst, although the latter might get more into the proverbial data weeds!) If marketers don’t do this, they will lose their admission ticket to the marketing world.


Future of Work: Interview with Anthony Goldbloom, Founder and CEO of Kaggle

As someone whose career in the 21st Century has focused mainly on user contribution systems and user created content, I leverage several crowd-sourcing sites on the Web. One of my favorites is Kaggle, which, according to its Australian CEO, Anthony Goldbloom, whom I recently spoke to, enables people to outsource big data questions. Every predictive modeling problem is framed as a competition: the person who builds the most accurate model gives that model to the company, and in exchange the company gives them a prize. Kaggle is a powerful way to build predictive modeling algorithms. Why is this important? Imagine a bank being able to predict who will default on a loan. (Note: Predictive models are created or chosen to try to best predict the probability of an outcome. In many cases the model is chosen on the basis of detection theory to try to guess the probability of an outcome given a set amount of input data, for example, given an email, determining how likely it is to be spam. Definition from Wikipedia.)

Anthony Goldbloom, CEO of Kaggle

Goldbloom came up with the idea for Kaggle while working at The Economist. He worked on an article on big data and data science, although as Anthony reminds me, “It wasn’t called that at the time.” While talking to CIOs who were struggling to get value from their data, he knew he could help solve their problems and could “put up those problems (on the web)” so that people could prove their mettle by actually solving them.

During our discussion, Goldbloom mentioned two competitions:

  1. The William and Flora Hewlett Foundation (Hewlett) reached out to Kaggle’s data scientists and machine learning specialists to develop an affordable solution for automated grading of student-written essays. (Not sure my wife, who is a high school teacher, will like this.) The Hewlett Foundation ended up collecting 24,000 graded essays written by high school students. In the end, a British hedge fund trader (trained as a physicist), a software developer at the National Weather Service and a German grad student created the winning solution, which can help schools assess students’ writing. The Foundation sponsored the contest and awarded $100,000 to the top three research teams. In all, 250 teams participated and there were 2,500 submissions. (Note: None of the winners had a data science background.)
  2. The Wikipedia Challenge focused on getting data-mining experts to build a model that predicts the number of edits an editor would make. Wikipedia wanted to understand what factors determine editing behavior. Contestants were expected to build a predictive model that can be reused by the Wikimedia Foundation to forecast long term trends in the number of edits that we can expect. There were 94 Teams with 115 players and 1024 entries. Here’s a page describing the challenge:

Kaggle combines many of the popular current trends in the industry: gamification, crowdsourcing, virtual workforce, and, of course, Big Data. (Venture Capitalists must love this company).

Companies can build models in house or hire a consulting firm like Accenture. Kaggle’s crowdsourcing solution is a new third option. As Goldbloom points out, “Companies are beginning to see Kaggle as a leveraged arm of their own business.” How does it work? Companies and researchers post their data. Statisticians and data miners from all over the world compete to produce the best models. Companies identify a problem and then leverage Kaggle’s active community to solve it. This crowdsourcing approach relies on the fact that there are countless strategies that can be applied to any predictive modeling task, and it is impossible to know at the outset which technique or analyst will be most effective.

Kaggle’s secret sauce is that there’s lots and lots of data out there, and a strong desire to play with this data.

In particular, Kaggle is gaining the most traction in financial services, in the technology sector, and in life sciences. Competitions filter talent and also let the best data solutions float to the top of the pack while people are giving objective feedback along the way.

As Goldbloom points out “The really nice thing about these predictive modeling tasks is you can back test people’s algorithms on historical data and get a sense for which algorithms perform well and which algorithms don’t perform so well.”

Most of the 45,000 members on Kaggle call themselves data scientists, which is one of the hottest professions in Silicon Valley. Most of them, however, have engineering or computer science degrees. Here’s a breakdown of their professions:

Kaggle has several public offerings:

Kaggle Prospect (in beta now): Practice Fusion (another favorite company of mine), a vendor of electronic records, used it by opening up their data to determine what types of problems could be solved, such as predicting who will develop diabetes.

Kaggle In-Class is another product. Its competitions, whether predicting the past or the future, require students to build models that are evaluated against past outcomes. For example, an instructor might host a predicting-the-past competition that requires students to build models to predict wine prices based on country of origin, vintage, and other factors. The winning model would then be the one that most accurately predicts actual prices from a set of historical price outcomes (hidden from the students).
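
To make the predicting-the-past idea concrete, here is a minimal sketch of how such a competition could be scored. Everything in it is an assumption for illustration: the prices, the student names, and the choice of root-mean-square error as the metric (the source does not say which metric Kaggle In-Class uses).

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between predictions and the hidden outcomes."""
    if len(predicted) != len(actual):
        raise ValueError("predictions and outcomes must have the same length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(actual))

# Hypothetical historical wine prices, hidden from the students.
hidden_prices = [20.0, 35.0, 50.0]

# Hypothetical student submissions for the same three wines.
submissions = {
    "student_a": [22.0, 33.0, 49.0],
    "student_b": [20.0, 40.0, 60.0],
    "student_c": [21.0, 35.0, 50.0],
}

# Lower error is better, so the leaderboard sorts ascending by RMSE.
leaderboard = sorted(submissions, key=lambda name: rmse(submissions[name], hidden_prices))

for rank, name in enumerate(leaderboard, start=1):
    print(rank, name, round(rmse(submissions[name], hidden_prices), 3))
```

Because the outcomes already exist, every submission can be scored the moment it arrives, which is what makes a real-time leaderboard possible.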

Kaggle has a great business model, one that should be considered by other crowdsourcing companies. As Goldbloom explains:

“Competitions are open to everybody. The sole purpose of these competitions is to qualify talent. So if you finish in the top ten percent of two public competitions, we’ll label you as qualified talent.” Most of Kaggle’s commercial work, such as banks trying to predict who’s going to default on a loan, is conducted via a private competition. “For private competitions we basically invite 15 of our strongest members. Each of them compete behind the scenes and the prize money is consistently – it’s a six figure sum and we also take a large fee on those private competitions.”

The private competitions require large data sets and an invitation-only crowd-sourcing process, both of which are kept private. All the participants receive some sort of monetary reward.

Here are some examples on potential ROI vs. Realized ROI.

Transactional Fraud: A large credit card issuer.

Assume the issuer has 50MM credit cards, with customers spending $500 per month on average. Based on current industry estimates, let’s assume the issuer experiences 10 basis points (1 basis point is 1/100th of 1%) in fraud losses. That puts total fraud losses in the neighborhood of $300MM / year (50MM * 500 * 12 * 10 basis points). A mere 5% reduction in fraud losses with a better model would generate an incremental return of $15MM / year. This can easily put the ROI in the double digits, especially when you think about how much time and how many people you would need to resolve these issues.
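
For readers who want to plug in their own numbers, the arithmetic above can be sketched in a few lines of Python. The inputs are the assumed figures from this example, and the function names are mine, chosen for illustration.

```python
# Fraud-loss estimate using the assumed figures from the example above.
BASIS_POINTS_PER_UNIT = 10_000  # 1 basis point = 1/100th of 1%

def annual_fraud_loss(cards, monthly_spend, fraud_bp):
    """Yearly fraud loss in dollars for a card portfolio."""
    annual_spend = cards * monthly_spend * 12
    return annual_spend * fraud_bp / BASIS_POINTS_PER_UNIT

def annual_savings(loss, reduction_pct):
    """Dollars saved per year if a better model cuts losses by reduction_pct percent."""
    return loss * reduction_pct / 100

loss = annual_fraud_loss(cards=50_000_000, monthly_spend=500, fraud_bp=10)
savings = annual_savings(loss, reduction_pct=5)

print(f"Fraud losses: ${loss / 1e6:,.0f}MM / year")                # $300MM / year
print(f"Savings at a 5% reduction: ${savings / 1e6:,.0f}MM / year")  # $15MM / year
```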

Retail consumer marketing: A large retailer

A big box retailer with over 20MM customers sends product promotions to its customers on a monthly basis. Typically, fewer than 1% of customers respond to these offers. Assuming a 1% response rate and that each responding customer spends $200 on average because of the marketing offer, the retailer probably sees $40MM (20MM * 1% * 200) in incremental sales. A better predictive model through Kaggle can easily double or triple the response rates to these marketing offers, thereby leading to $80MM to $120MM in incremental sales!
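
The same kind of quick check works for the retail case. This sketch simply reruns the calculation at 1%, 2% and 3% response rates; the customer count and spend per responder are the assumed figures from the example, and the function name is illustrative.

```python
def incremental_sales(customers, response_pct, spend_per_responder):
    """Incremental sales in dollars for a promotion with the given response rate (in percent)."""
    return customers * response_pct * spend_per_responder / 100

# Baseline (1%) plus the doubled and tripled response rates discussed above.
scenarios = {pct: incremental_sales(20_000_000, pct, 200) for pct in (1, 2, 3)}

for pct, sales in scenarios.items():
    print(f"{pct}% response rate: ${sales / 1e6:,.0f}MM in incremental sales")
```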

Goldbloom’s team’s grand vision is to create a meritocracy, “a labor market where the best people rise to the top, both in terms of skill and value.” (A meritocracy is a system where appointments and responsibilities are objectively assigned to individuals based upon their “merits,” namely intelligence, credentials, and education.)

Goldbloom provides an example: “Roger Federer is ranked number one in the world because he wins more tennis matches than other tennis players. I would very much like to see us create the world’s first meritocratic valuable labor market. So, you know, I mount the argument that, Roger Federer is a phenomenal athlete, but he doesn’t generate, you know, a lot of value.” (most people in the audience, for example)

I highly recommend that you check out Kaggle!

ROI (Real Overall Impact!)

  1. Use a public area to identify potential leaders to participate in a private area
  2. Leverage a real time leaderboard which motivates people
  3. Enable the community to determine the content – what problem will be resolved.
  4. Check out Hacker News for a good implementation of the Thumbs up / Thumbs down process
  5. The platform for uniting free agents is important.
  6. People learn more by doing vs. sitting in a class or reading a user manual

Transcript of Interview:

What is Anthony reading?

Interview was recorded on June 20th, 2012 and written up at the Trident Bookstore in Boston, while watching another competition: Women’s Gymnastics at the London Olympics!

Thank you Nation!


Mind the Gap

Once a month, I take a peek in my ‘to read’ folder in my Google Drive and catch up on some reading.

One almost-forgotten article highlighted some research showing a big measurement gap between ‘what’s important to management’ and ‘what can actually be measured’ (see chart way below). According to the research, marketers seem to focus on data that is least important to management; they tend to focus on likes, clicks, downloads, etc. Unfortunately, only a small percentage are starting to focus on key areas such as customer lifetime value:

In an earlier interview here, Gary Angel, CEO of Semphonic, highlighted this point:

They are (beginning to) look at customer segmentation and lifetime value, and building predictive models that help you understand which customers might attrite or which are the best candidates for retention- models and analysis that really help you understand which of your operational and marketing efforts drive incremental lift and change customer behavior.

Financial institutions, airlines and others with affinity type of programs have been some of the few industries to understand their various customers from a financial perspective. When I worked at American Express back in the late 1900s (1989-1992), all members were placed in deciles (In descriptive statistics, any of the nine values that divide the sorted data into ten equal parts, so that each part represents 1/10 of the sample or population). Companies, especially those on the web or on mobile platforms, need to start approaching their customer base in this manner so they can understand their most valuable customers (not always the ones spending the most), their least valuable customers, and those with high probability to move up to one of these important segments. 
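
As a rough illustration of the decile idea, here is a small Python sketch. The member names and spend figures are made up, and ranking by annual spend alone is an assumption for simplicity; as noted above, the most valuable customers are not always the ones spending the most.

```python
from bisect import bisect_right
from collections import Counter
from statistics import quantiles

# Hypothetical annual spend for 20 members: 100, 200, ..., 2000 dollars.
annual_spend = {f"member_{i:02d}": i * 100 for i in range(1, 21)}

# The nine cut points that divide the sorted values into ten equal parts.
cut_points = quantiles(list(annual_spend.values()), n=10)

def decile_of(value, cuts):
    """Return the decile, from 1 (lowest values) to 10 (highest values)."""
    return bisect_right(cuts, value) + 1

deciles = {member: decile_of(spend, cut_points) for member, spend in annual_spend.items()}
members_per_decile = Counter(deciles.values())

print(cut_points)
print(deciles["member_01"], deciles["member_20"])
```

Numbering the lowest spenders as decile 1 is just a convention picked for this sketch; some teams number the top spenders as decile 1 instead.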

Some important items to consider when looking at the value of a customer:

  • Determine how much it costs to engage with them and drive them to a transaction
  • Break this information down by channel (Google, social network, email, etc.)
  • Subtract your costs (decide if you want to make these costs fully loaded, including the costs of employees)
  • Ensure that you can track each individual, their original channel, etc. over time
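
The checklist above can be sketched as a small calculation. This is only an illustration under stated assumptions: the customer records, channel names and dollar amounts are invented, and `overhead_per_customer` is a hypothetical stand-in for whatever fully loaded cost (such as employee time) you decide to include.

```python
from collections import defaultdict

# Hypothetical customers, each tagged with the channel that originally acquired them.
customers = [
    {"id": "c1", "channel": "search", "acquisition_cost": 40.0, "revenue": [60.0, 30.0, 30.0]},
    {"id": "c2", "channel": "email", "acquisition_cost": 5.0, "revenue": [20.0, 10.0]},
    {"id": "c3", "channel": "social", "acquisition_cost": 25.0, "revenue": [15.0]},
    {"id": "c4", "channel": "search", "acquisition_cost": 40.0, "revenue": [50.0]},
]

def net_value(customer, overhead_per_customer=0.0):
    """Revenue tracked over time minus acquisition cost and any loaded overhead."""
    return sum(customer["revenue"]) - customer["acquisition_cost"] - overhead_per_customer

def net_value_by_channel(customers, overhead_per_customer=0.0):
    """Total net customer value, broken down by original channel."""
    totals = defaultdict(float)
    for customer in customers:
        totals[customer["channel"]] += net_value(customer, overhead_per_customer)
    return dict(totals)

print(net_value_by_channel(customers))                            # direct costs only
print(net_value_by_channel(customers, overhead_per_customer=10))  # fully loaded
```

Note that this only works if each customer can be tied back to their original channel over time, which is the last item on the list.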

Tracking true ROI and lifetime value requires a real metamorphic change in some organizations. It requires a lot of data crunching, strong analytic skills (something that is in high demand), and is intellectually challenging. As Gary Angel points out it’s a challenge “to balance the long-term impact on retention with a short-term monetization opportunity around display than to simply “optimize” your revenue, that the two tasks can hardly be compared.”

The research also touched upon the discrepancy between what is being tracked and what management wants to track, such as brand awareness: 78% of marketers said it is important to executive leadership, but just 32% of them feel they can actually assess it. This is nothing new. For decades, brand (only) marketers have fought to prove their value because so much of awareness advertising is untrackable.

Today, though, marketers finally realize that building brand encompasses a great deal more than a nice logo or tag line. The complete customer experience impacts the awareness and impact of a brand. (See my Hugh Dubberly interview). 

One area marketers seem to be doing a good job is in driving traffic to the site. The problem, however, is that driving traffic to the site is kind of an older paradigm (unless you want to own all the transactions). I would recommend to ‘fish where the fish are,’ and conduct your marketing efforts and engage with customers where they spend their time.

NOTE: customer base was used for this research. Its site states that its user base consists of 449,000 entrepreneurs, small-business owners, and professional marketers at the world’s largest corporations. This leads me to believe that not many executives were included in the research. Most of these people (at least the ones I work with, and as some of my big data research has shown) believe management cannot clearly articulate its KPIs for success. Let’s just say there’s a healthy tension between the two groups.

It’s important to really (gently) force management to clearly articulate its quantitative criteria for success, and if you don’t have the means to get to those numbers yourself, then seek outside assistance. I can recommend some firms, if you would like (and not promote myself).  


Webinar Content: The Big Data Challenge

Gary Angel, Marshall Sponder and I conducted this webinar on 6/26.

Challenging the Analytics Community: Big Data

  • Choosing the right Technology Stack
  • Adapting your Analytics Methods
  • Creating the necessary organizational synergies


The big data frontier isn’t really about crossing a threshold in the amount of data you use. It’s more about how you work with the data you have. Organizations like Google and Facebook have never had any choice but to work directly with their own data. For most enterprises, however, SaaS Web analytics solutions provided an easy way for organizations to manage and use their digital data. Only companies like Omniture and Webtrends had to do the heavy lifting with the data. That’s changing. More and more enterprises are deciding that the benefits of working directly with their data outweigh the costs and the challenges. But what exactly are those challenges? Technology, methodology and organization are all critical. Assessing the right technology stack for your data volumes, integration requirements and analysis needs is complex and involves significant trade-offs between performance, robustness, price, and total cost of ownership. Big data solutions fundamentally change the way you need to think about using your data. If you don’t re-think your approach to the data, you won’t reap significant benefits no matter which technology you choose. Nor are organizational challenges to be dismissed. Big data involves a collaboration of IT, measurement and marketing at a deeper level than traditional BI. Unless you create the right organizational synergies, your effort will likely fail.

In this webinar, Gary Angel, Marshall Sponder and Scott Wilder will tackle the key big data challenges you’ll face as you move toward this brave new world of analytics.

Back by popular demand: Webinar on Big Data

In just a few weeks — on June 26th at 10 am PDT, the three amigos will ride again and do a webinar on Three Main Challenges for leveraging Big Data:

  • Choosing the right Technology
  • Adapting your Analytics method
  • Creating the necessary organizational synergies

Register here:

I will lead the discussion with my two amigos:

We have done about five webinars together, and for each one we answer every question, either during or right after. So come join us on June 26th at 10 am PDT / 1 PM EST.

Future of Work: Interview with Gary Angel, Founder and CEO of Semphonic, Inc.

This week’s guest was Gary Angel, Founder and CEO of Semphonic, Inc., a leader in web measurement and analytics. Gary and I first met back in 2005, when I was managing Intuit’s Online Community. At the time, I wanted to get beyond clicks, page views and links (like Lions, Tigers and Bears, oh my) and identify new and innovative ways to think about analytics in the ecommerce and social web. After meeting with many web analytics consultants, I finally was schooled (in a good way) by Gary. He is not only one of the leading innovators and practitioners in this space, but also is one of the most effective business consultants I have worked with.

Gary started his career working as a programmer in the finance industry, which has proved invaluable in growing his analytics business. Financial institutions tend to be light years ahead of most companies when it comes to looking at customer behavior. I know this firsthand; twenty plus years ago, I worked at American Express (AMEX), and since that time, I have worked at a number of Fortune 100 companies, and none of them have been as sophisticated as AMEX when it comes to customer data. Maybe that’s one of the reasons Gary and I like getting together and discussing customer behavior and data. We both received our initial training at the American Expresses of the world. (By the way, we do a monthly webinar series on the topic and you can find some of our previous presentations here).

Gary leveraged his early programming experience in the financial industry: “there’s a lot of similarities (with credit cards and web behavior) that people over on the websites are showing you their interests by the way they navigate and the way they move through your pages and what they look at and what they buy and how they spend their time.” Interestingly, some of the data and predictive modeling did not carry over so well into the web space.

At Semphonic, Gary has witnessed first-hand the evolution of web analytics. He points out that in the early 2000s, people were quite skeptical of web analytics and they didn’t know how to get beyond how many page views or clicks they had. Neither of these items is very useful for building customer profiles and segments.

Since so much of Semphonic’s early work focused on how users navigated a website, Semphonic developed an approach called Functionalism, which basically breaks up a web site into its constituent pieces and then assigns one or more specific functions to each piece. These functions can be things like navigation (e.g. route visitors to a specific place), motivation (e.g. convince a user to do something) or information (e.g. provide a visitor with some piece of information). Based on the functions of the page, it is assigned a particular page type from a set of common templates that they’ve distinguished over time in the measurement of different types of sites.

As the graphic below indicates, you can build expected (and desired) use cases around people’s behavior, and then track their success rates. Or you can look at their ‘Say-Do’ ratio in your research. You can ask them how they use a page or a website and then track their behavior online to see if those two things are consistent.

The colors red, green and yellow indicate whether or not they completed the desired task. If they didn’t, that would be considered ‘an outage’, or an unsuccessful task. Searching on a word, getting a search results page and leaving the site, for example, would be considered an outage. This is really important when determining how to spend your resources and how to fine-tune the way you lead people to a shopping cart or a desired transaction, such as downloading a white paper.
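
To show what tracking outages might look like in practice, here is a minimal Python sketch. It is not Semphonic’s implementation; the use-case names and session data are hypothetical, and an outage is simply counted as any session where the desired task was not completed.

```python
from collections import defaultdict

# Hypothetical sessions: the task the visitor attempted and whether they completed it.
sessions = [
    ("site_search", True), ("site_search", False),
    ("site_search", False), ("site_search", True),
    ("white_paper_download", True), ("white_paper_download", True),
    ("white_paper_download", True), ("white_paper_download", False),
    ("checkout", True),
]

def outage_rates(sessions):
    """Share of sessions per use case that ended without the desired task being completed."""
    attempts = defaultdict(int)
    outages = defaultdict(int)
    for use_case, completed in sessions:
        attempts[use_case] += 1
        if not completed:
            outages[use_case] += 1
    return {use_case: outages[use_case] / attempts[use_case] for use_case in attempts}

rates = outage_rates(sessions)
for use_case, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{use_case}: {rate:.0%} outages")
```

Sorting by outage rate puts the use cases that most need attention at the top, which is one way to decide where to spend resources first.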

Semphonic’s evolution mirrors the maturation of web analytics. As Gary explains:

“Over time we’ve become a full-service digital measurement consultancy where we work across the whole spectrum of problems around web analytics, and also around new channels like mobile and social. But we’re still in almost every case focused on really doing analysis for people because I think it’s at the analysis level that you actually start to see results from data. It’s where you actually start to take the data and make recommendations about how to improve your business; how to do better from a marketing or an operations perspective. As a company that’s what we’ve always been focused on and I think all of the other stuff is just getting you to the point where you can really do that.”

Gary calls this classic web analytics, but sees companies finally moving into a new and exciting area.

“They are (beginning to) look at customer segmentation and lifetime value, and building predictive models that help you understand which customers might attrite or which are the best candidates for retention- models and analysis that really help you understand which of your operational and marketing efforts drive incremental lift and change customer behavior.”

This is the result of ‘the maturity of the web analytics’ market and the explosion of online data on individual users. The irony in all of this is that classic direct marketers and cataloguers have been doing this type of database marketing for years. I guess this is an example of history repeating itself. Another reason for focusing more on behavioral data is that companies are relying less on third parties to house their data. This change has created a gap (or a need) in the number of data analysts and data experts companies need. A big challenge moving forward will be to find, hire, train and retain these folks. In some ways, they will achieve the rock star status that many computer engineers currently have. (Maybe I should go back to my roots and become a CDO, the Chief Data Officer in a company). Companies need to be more “audience” focused and less “traffic” focused. The current approach focuses more on campaigns, but it doesn’t give you any sense of whether the end result was a profitable customer or a customer that’s costing you money. Companies should be focusing far more on the LTV of a customer!

Besides focusing more on customer behavior, Gary believes organizations need to become less siloed. As we’ve discussed here before, different divisions or different channels (customer service and marketing) have different success metrics, different tools and rarely share learnings across an organization. I think this is a big opportunity; companies need to give customer service a seat at the product development table.

When asked about mobile and smart phone platforms, Gary stated that the fixed web (what we think of as regular websites) and mobile are currently tracked in similar ways. The opportunity, he believes, is to treat mobile analytics differently, especially when it comes to mobile apps: how are people using the app, how does it fit into a broader customer journey, etc. “You know, one of the challenges to measuring mobile apps is they do not look like websites. So it’s not page-by-page situation. That’s a paradigm that does not work very well when it comes to smartphones and other devices. If you try to fit your mobile app into a page based paradigm, you’ll find, I think, that your measurement isn’t very interesting.”

Big Data: No analytics discussion these days happens without touching upon big data. I think there is a technology barrier that companies have to cross. It’s very difficult to do customer level analytics within the web analytics solutions that are on the market. One interesting reason for this phenomenon is that companies realize they can control their destiny more by relying less on third parties to manage and manipulate their data. This has created new titles, such as Data Scientist, hot new areas, such as business intelligence, and a shortage of well-qualified analytics practitioners. As data management has moved in house, IT has gotten more involved and closer to marketing. IT can help determine what sort of technologies are needed for the amount of data a company has. For example, it might not make sense to build the Stealth Bomber warehouse, when a simple prop plane infrastructure might suffice (my analogy).

Gary emphasized the growing importance of analyzing qualitative information versus just looking at the numbers when transitioning to customer analytics. He believes “the biggest opportunity at the enterprise level right now is to really consistently and aggressively exploit attitudinal, social, textual data that’s collected. Most enterprises we work with collect vast amounts of data at the call center level. They collect vast amounts of social media data. They collect opinion lab data on their sites. They do attitudinal surveys, both on and off-line. But all those efforts tend to be completely siloed. And not only are they siloed, they’re non-standardized and the distribution of the information is poor. And so what we find is that every research effort tends to be a one-off. And that’s really ineffective.” This is a key organizational challenge: how to get different parts of the organization to share data on a consistent basis and to establish a common language and approach.

Gary continues: “creating a set of standards for how we talk about and think about those customers, making sure that the data quality is consistent, that we categorize customers the same way; all the things that have become standard practice around structured behavioral data are critical to using attitudinal information effectively. But none of those things are done in the world of attitudinal and unstructured data. I think there’s a tremendous opportunity for competitive advantage to organizations who are willing to put the effort into effectively build a voice of customer warehouse; to consolidate all that information; to put standards around it; to make sure that they’re doing it on a consistent basis and to distribute it out to everyone in the organization in a consistent fashion. I don’t see many of our clients doing any of those things and I think there’s tremendous benefit to a voice-of-customer warehouse relative to cost; probably more than anything we’re doing on the behavioral side.”

I love the term “voice of the customer warehouse.” It might not be easy to say, but it is definitely something that most organizations need. Unfortunately, when it comes to looking at what customers are actually saying, companies usually just focus on word frequency or sentiment analysis. We both believe it is important to understand the use of language – how people describe things. Gary continues “it’s much more understandable and actionable when I can understand whether the sentiment is related to issues around customer support or issues around product functionality or issues around my advertising campaign. Those are three fundamentally different things, and frankly, to understand overall that my brand sentiment on Twitter is going up or down doesn’t really tell me very much because Twitter’s not a representative sample. And so I have to be able to contextualize it relative to what that sentiment’s about before I can actually understand what it means and what to do about it.”
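To make Gary's point about context concrete, here is a minimal Python sketch that averages sentiment per topic instead of reporting one overall number. The keyword lists, the function names and the sample comments are all hypothetical, and the scores are assumed to come from whatever sentiment tool you already use; a real program would likely need a proper text classifier rather than keyword rules.

```python
import re
from collections import defaultdict

# Hypothetical keyword lists, for illustration only.
TOPIC_KEYWORDS = {
    "customer support": ["support", "help desk", "wait time", "agent"],
    "product functionality": ["feature", "bug", "crash", "login"],
    "advertising": ["ad", "commercial", "campaign", "banner"],
}


def topics_for(text):
    """Return every topic whose keywords appear in the text as whole words."""
    lowered = text.lower()
    return [
        topic
        for topic, keywords in TOPIC_KEYWORDS.items()
        if any(re.search(r"\b" + re.escape(k) + r"\b", lowered) for k in keywords)
    ]


def sentiment_by_topic(comments):
    """Average sentiment per topic.

    comments: iterable of (text, score) pairs, where score is a number
    in [-1, 1] produced by any sentiment tool (assumed, not computed here).
    """
    scores = defaultdict(list)
    for text, score in comments:
        # Comments that match no topic are kept under "uncategorized".
        for topic in topics_for(text) or ["uncategorized"]:
            scores[topic].append(score)
    return {topic: sum(vals) / len(vals) for topic, vals in scores.items()}


# Made-up sample data.
sample = [
    ("The support agent was rude", -0.8),
    ("Love the new feature", 0.9),
    ("That commercial is everywhere", -0.2),
    ("The app keeps hitting a bug at login", -0.6),
]
print(sentiment_by_topic(sample))
```

Even with toy data like this, the per-topic view separates a support problem from a product or advertising problem, which a single brand-level sentiment score would hide.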

When asked about exciting trends, Gary discussed some of the work the Financial Times is doing in integrating their mobile data with their web data. Most companies, as we discussed earlier, tend to keep these two areas separate. One of the last topics we discussed was which industries have done a good job in web analytics. Not surprisingly, he highlighted the success traditional direct marketing (catalog) companies have had vs. other companies that have historically relied more on a direct sales force or a unique product. In Gary’s opinion, those companies have a culture pre-built to focus on and consume analytics.

Since Gary touched upon many different areas, I will use my next post to discuss the impact of Web and Mobile analytics on the organization.

Insights shared:

  • Functionalism: An approach for understanding issues/outages with your website
  • Focus on Lifetime value (even with websites and mobile platforms)
  • More companies are managing their own analytics software vs. relying on a vendor
  • Organizations starting to move data in house and are looking at big data solutions—so IT needs to get more involved
  • Companies face real organizational issues because their analytics / data initiatives are very siloed
  • Think about combining, for example, your call center learnings/verbatim with the website info
  • Focus site metrics less on traffic and conversion rates; rethink the approach and make it more audience focused
  • Go beyond just being attribution focused (more than just how each channel is doing)
  • Don’t think of mobile analytics in the same way as web analytics; there are differences
  • But you can use similar approaches for mobile and web – tagging, reporting
  • But there are differences too, resulting from types of websites, screen size and device dependencies; and mobile apps are different again
  • Opportunity to measure web apps and mobile apps better
  • When it comes to behavior analysis, consider building a (big data) Voice-of-the-customer warehouse
Other Info:

Big Data and Privacy


This weekend I had some down time and decided to read The Daily You by Joseph Turow, dean of Graduate Studies at the Annenberg Communications School at University of Pennsylvania.

And all I can say is that this book is a must read for anyone working in the digital space, especially advertisers. The book starts out with a nice history of web advertising and then goes on to discuss today’s customized advertising, discounts, news and entertainment, all of which are being tailored by newly powerful media agencies on the basis of data we don’t necessarily know they are collecting and individualized profiles we don’t know we have. Advertisers are placing individuals into what the author calls “reputation silos.” (These are really different psychographic types of segments.)

For example, you might be categorized as a Caucasian living in New York City who only eats organic foods and watches Mad Men every week. Is that such a bad thing? It depends on what types of ads and offers are being served up to you based on this information.

The main message of the book is that although we love cool new web based technologies and platforms (Facebook, etc.), we as consumers run the risk of giving up our privacy and anonymity to advertisers.

Reading this book reminded me of my days at AOL, when I worked on their first commercial Internet properties, GNN and WebCrawler, creating advertising inventory. One day back in 1995 stands out for me. It was when Procter & Gamble, the largest media buyer, wanted to advertise on several of our properties. My co-workers and I spent the rest of the month running around like chickens without our heads making sure everything went perfectly for P&G. It was a simple reminder that the advertiser rules when revenue is involved.

According to the author, we are just at the very beginning of an advertising or consumer behavior tracking revolution as advertisers aim to integrate consumer information across multiple platforms (the web, mobile, and TV). This is the Holy Grail for marketers. Companies like Google will also use this information to serve up personalized search results, not just ads. Ironically, when people were asked how they’d feel if a search engine tracked what they searched for, 65% said it was a bad thing. 73% overall said they were “Not OK” with personalized search, since they felt it was an invasion of their privacy.

Although Turow doesn’t touch upon Facebook’s and Google’s recent and ever-changing advertising privacy policies in the book, he does provide some good commentary on this topic in a recent interview he did on NPR’s Fresh Air with Terry Gross.

So far, The Daily You has not gotten the press it deserves. So, take a chance, buy it and read about where media and advertising are going. And for those folks who are media buyers or work with major advertisers, it is an important read because it will provide some valuable insights into customers’ and viewers’ privacy concerns.

Companies can get more informed and responsible by becoming members of the Network Advertising Initiative (“NAI”) and adhering to the Digital Advertising Alliance’s Self-Regulatory Principles for Online Behavioral Advertising. If you’re an online user, you can find out more about online behavioral advertising and learn what choices you have and how to use browser controls and other measures to enhance your privacy.

Since online advertising is becoming more and more complex, what do you think both publishers and advertisers should do in the face of the increasing discussion about consumers’ privacy?

Analytics: Key part of Social Business Center of Excellence

 So you want to build out a robust social media analytics program for your company, eh?

This process should be very similar to the approach you took in building out your digital analytics program. Follow the same trail to the summit.

Like any good journey, you need to make sure to focus on the basics first, such as:

    1. Getting internal alignment from your key stakeholders on your business objective (Hopefully, one objective!)
    2. Obtaining sign-off on the key metrics you want to look at
    3. Understanding your organizational constraints and resources
    4. Identifying and setting up the right tools/technology

But before launching a program, there are some important steps along the way that you should seriously consider:

    1. Work closely with your IT group because they usually set the standards for bringing technology into an enterprise environment
    2. Work closely and meet often with your financial partner (usually there is a finance guy assigned to your team) to show them that you are working on driving the business forward, that you understand what you are doing.
    3. Establish a baseline to measure from and know that every so often you might have to ‘move the goal line’ of desired results as well as the original baseline because your growth may be skewed in the early stages of the program
    4. Incorporate Share of Voice vis a vis your direct competitors and your indirect competitors (if you are selling financial software to small businesses, Excel can still be viewed as a competitor)
    5. Understand that there can be multiple ROIs for the whole organization since different groups have different objectives in using social media.
    6. Know that if you have an international focus, the same tools might not always work as the ones you use domestically
    7. Build in a mobile component to your social media analytics because as we all know, it is here to stay.
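The baseline and Share of Voice steps above boil down to simple arithmetic, so here is a rough Python sketch. It assumes Share of Voice means your mentions divided by all mentions (yours plus competitors') over the same period and sources; the brand names and counts are made up, and the function names are mine, not from any tool.

```python
def share_of_voice(mentions):
    """Each brand's fraction of total mentions.

    mentions: dict of brand -> mention count, all measured over the
    same period and the same sources.
    """
    total = sum(mentions.values())
    if total == 0:
        return {brand: 0.0 for brand in mentions}
    return {brand: count / total for brand, count in mentions.items()}


def lift_over_baseline(current, baseline):
    """Relative change versus the baseline period (0.25 means +25%)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (current - baseline) / baseline


# Made-up counts: one direct competitor and one indirect one (Excel).
mentions = {"our_brand": 300, "direct_competitor": 500, "excel": 200}
sov = share_of_voice(mentions)
print(sov["our_brand"])               # our fraction of all mentions
print(lift_over_baseline(125, 100))   # growth versus the baseline period
```

If you later 'move the goal line', rerun the lift calculation against the new baseline so that early, skewed growth does not distort the trend.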

Most of the above applies to an enterprise type or Fortune 100 company. Ideally, the individuals working on measuring your success would be part of a Center of Excellence. Note, however, that this is more than the hub-spoke model, where your social media team resides in the middle with representatives from multiple groups.

One of the challenges with this model is that the groups representing the spokes are not funding a full time or part time person to look at social media, but rather having someone ‘just attend the meetings.’ Secondly, the Hub (the social media team) tends to still be influenced by where it sits in the organization. If it sits with the public relations team or corporate communications team, those groups’ business objectives might not support other divisions. Ideally, I think Social Media today should be a true Center of Excellence, completely funded independently, and set up like finance or human resources, where the group assigns individuals to support others in the organization.

This Center of Excellence idea is not completely new. The big difference here is that I am recommending it be treated like finance, legal or HR. Not in terms of being more of an operational role, but rather focused on a stand alone entity that embeds its own people into each group and pays for those people vs. having it be someone from a business group’s part time job. After talking to many companies about how they address social media in their organization, many wrestle with either a) individual groups doing their own thing or b) they only have a few hours a week of a business person’s time.

More on the center of excellence next time I blog here…

Oh yeah... Yes, your data jockey(s) should be part of this team too. : )

Follow up to Webinar on Social Media Tools

Here are the answers to the questions you sent us:

By — Scott Wilder, Gary Angel and Marshall Sponder

As usual I enjoyed the recent Social Media Measurement webinar – and it was great to have Marshall on as well. Tools always draw a crowd and this was no exception. Here are the questions we got along with our joint answers…

Question: What tools are best for measuring social media ROI or business lift, with respect to advertising on Facebook, Twitter, LinkedIn, etc.?

Marshall: There’s actually a new platform launching next week called Unified (I will be at the launch) that promises to do something like that – I’ve seen the platform close up and I can tell you I am impressed. It may be that 2012 will be a year where ROI will no longer be a totally elusive goal for social media.

Gary: This is far more difficult, I think, than people generally believe. The only easy path to ROI measurement is when users are either directly engaged in commerce on social sites (which is rare) or are directly clicking through to sites where they are engaged in commerce. In these cases, measurement is generally a straightforward application of existing Web analytics campaign tracking capabilities. Unfortunately, this isn’t often the case. In some cases, I’m not even sure that ROI is the proper path to measurement and where it is, I don’t think there is likely to be one answer or approach. If your Facebook advertising is directed toward increasing your Fanbase, you need to be able to measure the incremental value of a Fan (and this won’t be one value by the way) to your marketing. Getting that measure takes a concerted research effort and won’t (in my opinion) be delivered by any single tool. I sometimes think that it might be better for organizations to – at first glance – concentrate on the obvious optimization points. It’s much easier to measure which campaigns generate engaged Fans and calculate their cost-efficiency in that respect. You can then optimize campaigns within the set of those targeted toward increasing your fanbase. It’s not ideal, but it is more practical.

Scott: In most cases, companies have to guesstimate true ROI because of some of the limitations of the tools and also companies’ own infrastructure. I find it useful to create proxies – like determining cost estimates for certain activities, which in turn, would lead to a transaction.
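
As a rough illustration of the practical approach Gary describes above (comparing campaigns on the cost of generating engaged Fans rather than chasing true ROI), here is a small Python sketch. The `Campaign` fields, the campaign names and the numbers are hypothetical, and how you define an "engaged" Fan is left to you.

```python
from dataclasses import dataclass


@dataclass
class Campaign:
    name: str
    spend: float        # total spend, in any one currency
    engaged_fans: int   # Fans who met your own engagement definition


def cost_per_engaged_fan(campaign):
    """Spend divided by engaged Fans; infinite when a campaign produced none."""
    if campaign.engaged_fans == 0:
        return float("inf")
    return campaign.spend / campaign.engaged_fans


def rank_campaigns(campaigns):
    """Most cost-efficient campaign first."""
    return sorted(campaigns, key=cost_per_engaged_fan)


# Made-up campaigns.
campaigns = [
    Campaign("sponsored stories", 1000.0, 200),
    Campaign("sidebar ads", 1500.0, 500),
    Campaign("promoted post", 400.0, 0),
]
for c in rank_campaigns(campaigns):
    print(c.name, cost_per_engaged_fan(c))
```

This only ranks campaigns against each other; it says nothing about what a Fan is actually worth, which, as Gary notes, takes a separate research effort.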


Question: US cost is too high – example Engage121 is $1000 per month for first base level search – one profile with 3 seats.

Marshall: Well, as Gary pointed out, Engage121 is designed for a specific use case and type of client such as an airline or large franchised business with thousands of stores that each want a different response and editorial controls – think Dominos or Dunkin Donuts (though I think neither are Engage121 clients). My point being, you can’t take the price of a platform in isolation from the use case and clients for whom it is designed and targeted. The Dominos and Dunkin’s of the world have plenty of money and need for this kind of platform – but if you’re looking for an “affordable point of entry” into Social Engagement, then go with HootSuite and be happy there are still some free platforms you can play with and get your feet wet.

Gary: Not every market is going to be served by a tool like Google Analytics – free and really good. I basically agree with Marshall here. One thing I will say that’s more general is that in my experience some pricing models are much worse than others for doing serious enterprise work. To do our kind of measurement (Semphonic) we need a pretty free hand to construct, test and use profiles of all sorts and we generally need quite a lot of them because all the interesting questions involve categorization. At the enterprise level, I’d much rather pay a significant lump sum for a pretty free hand with the data than have a pay-per-item model. Pay-per-item models tend to cripple analysis.


Question: Do you have preference for tools to measure public opinion about political candidates – public policy or litigation issues?

Marshall: Yes, I am working with one right now – we are tracking two candidates in Rhode Island and breaking down their overlapping audiences – along with “persona” breakdowns of their twitter streams – here is what that looks like (I erased the names of the candidates because this is still in the very early exploratory stage of what works).

[Image: Political Social Image]

So far, the persona development breakdown looks impressive, as we can break it down by various sub dimensions and the founders at 6Dgree are very willing to pursue my suggestions, which really impresses me about them.  So yes, as of now, I believe 6Dgree might have a winning platform at an affordable price level that works for Twitter and Facebook.  Another is PeekAnalytics, but it’s not adapted specifically to Politics, yet.

6Dgree has done some interesting work with the Australian Labor Party around issues and produces a weekly portal report that breaks down tweets around several issues – I’m impressed with the solution, but of course, each campaign is slightly different and customization will always be a fact of life.


Question: What are the better tools for global internal scale? If any? Or just by world region?

Marshall: I like Comscore Media Metrix for world reporting. That’s mostly panel-based reporting, but it does a fairly extensive job of categorizing lifestyle and interest across channels, countries and technologies such as video, mobile and search.

Gary: Ditto Marshall. I like NMIncite for many larger markets. Alterian provides excellent language coverage.


Question: Do you believe the sampling of data should include statistical testing? Or how do you ensure your sampling is reflective of the entire population to provide confidence in the recommendations?

Marshall: Well, Gary has a pretty good post on that, written recently, and I think, rather than speak to it, I’ll let Gary address it.

Gary: Thanks for the plug! Let me know if the several blogs I’ve written on the subject don’t fully answer the question! Social Media Measurement is an odd blend of attempts to get universal coverage and hidden samples – which makes a single approach challenging. You can use statistical testing to measure the variations in your samples and, where possible (it isn’t at all levels) that’s certainly advisable.
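
For readers wondering what 'statistical testing to measure the variations in your samples' could look like in practice, here is one simple option sketched in Python: a two-proportion z-test comparing, say, the share of negative mentions in two samples. This is my own illustration, not Gary's method; it uses the normal approximation, so it assumes reasonably large random samples, which social data often are not.

```python
import math


def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Two-sided z-test for the difference between two sample proportions.

    hits_a / n_a and hits_b / n_b are the observed proportions, e.g.
    negative mentions out of all mentions in each sample.
    Returns (z, p_value).
    """
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    # Pooled proportion under the hypothesis that both samples are alike.
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("proportions are all 0 or all 1; test is undefined")
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts: 120 of 400 negative in one sample, 90 of 400 in another.
z, p = two_proportion_z_test(120, 400, 90, 400)
print(round(z, 2), round(p, 4))
```

A small p-value suggests the two samples really differ, but it cannot tell you whether either sample is representative of the whole population, which is the harder problem Gary points to.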


Question: When one wants to search and analyze Twitter postings and the topic is very low salience, so likely a very, very small percentage of Twitter mentions in U.S. in a given week, what are the best ways to maximize the amount of Twitter Firehose that you search to catch as many Twitter postings on your low salience topic as possible?

Gary: Depending on your method of access, you might want to start by talking with your vendor (if you’re using a vendor to make the initial data pulls). The initial pull is often tunable. This also speaks to your ability to capture the topic in all its forms. Traditional keyword research of the type often done for long-tail SEO can be useful. There is a range of tools appropriate for this – we’ve also just used scanning tools to pull the text off of sites (both client Websites, communities, and competitors) to try and build rich topic profiles. You can also take advantage of wildcards (in some tools) to scan for hash tags that include but are not limited to your topic. Hash tag references are often concatenations of the topic with other words and are nearly always pertinent. Sometimes, too, you have to be creative about what you’re looking for. If, for instance, you’re launching a product that is distinct, you can’t expect to identify potential influencers by targeting the obvious words – they generally won’t have any traction. So you have to look for analogs that might allow you to find and target a reasonable set of influencers.
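
If your tool does not support wildcards, the hash tag idea Gary mentions can be approximated on text you have already pulled. The Python sketch below is only an assumption about how you might do that with a regular expression; the function name, the topic and the sample tweets are made up.

```python
import re


def hashtags_containing(topic, tweets):
    """Find hash tags that contain the topic anywhere inside them.

    Works like a wildcard search (*topic*) over hash tags, ignoring case.
    Returns the distinct tags, lower-cased and sorted, without the '#'.
    """
    pattern = re.compile(r"#(\w*" + re.escape(topic) + r"\w*)", re.IGNORECASE)
    found = set()
    for tweet in tweets:
        for tag in pattern.findall(tweet):
            found.add(tag.lower())
    return sorted(found)


# Made-up tweets.
tweets = [
    "Loving #SolarRoof installs",
    "#solar is the future",
    "New #rooftopsolar deals",
    "nothing here #wind",
]
print(hashtags_containing("solar", tweets))
```

The tags this turns up can then be fed back into your keyword list or your vendor's initial pull to widen coverage of a low salience topic.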


Q: Any views on Netbase, which SAP just partnered with?

Marshall: Yes, it seems like a good partnership. Netbase does a pretty good job at NLP and creating structure and meaning around unstructured social data, and rather than SAP trying to build that (or buy Netbase, which is an option) they just partnered with them.

Scott: Netbase is doing some really interesting stuff, especially when it comes to Netnography (see www. I think the partnership with SAP will be good because I know that the company is putting a lot of energy into understanding their own segmentation better. We are doing some work for them right now. SAP is also making a big push in mobile analytics and would probably pull Netbase into that.


Question: Gary, perhaps you could ask each speaker to summarize which tool they think is strongest in each of the three key use cases you’ve outlined?

Marshall: Here’s a list of companies to consider

  • For PR Effectiveness  – I’d say mPACT and Cision.
  • For Consumer Sentiment – I would recommend NetBase (in fact) for its NLP capabilities.
  • For Social Campaign Effectiveness – Unified (once it launches)

Gary:


  • For PR Effectiveness: NMIncite – though it does a poor job with identifying influencers the segmentation is excellent for tracking them.
  • For Consumer Sentiment: Clarabridge and Crimson Hexagon – though we haven’t gotten to use Crimson Hexagon as much as we’d really like.
  • For Social Campaign Effectiveness: This is a tough one. Most of the new management tools provide some integrated reporting – but I think that really good effectiveness measurement demands that level of reporting plus Web analytics, plus traditional listening configured for the purpose, and maybe CRM-based extracts at the individual level as well (we sometimes analyze Facebook campaigns by extracting all the individuals and looking at their pre/post behavior).

Big Data DNA !!

Recently, a client told me that Big Data is an overused term. Unfortunately, it is also a relatively new area for marketers. A few years ago, bloggers started emphasizing the importance for CMOs to hire marketers who know technology, and now, there is a lot of commentary on the web about hiring data experts. In fact, one of the hot new job titles is data scientist.

One good thing about the Internet is that it has forced even traditional brand marketers to take a more rigorous approach towards analytics. Marketers have to get more data conscious focusing on customer data, prospect data, competitive data, online data, etc. They need to take a more holistic approach to data – and get Big Data thinking in their own DNA and into their team’s DNA.

Slowly but surely this is happening. A recent eMarketer survey showed, however, that there are still some inconsistencies in how it is defined. 48% of US data practitioners defined big data as the ‘aggregate of external and internal web-based data.’ 21% were unsure how to even go about defining Big Data.

The findings are a bit troubling, since Big Data is one of the top priorities of C-level leaders. For example, it is the number one concern of CIOs and is quickly becoming a big issue for marketers.

eMarketer highlighted the fact that more than half the companies they surveyed consider big data as a way to monitor competitors or their own brand. I would not call this Big Data analysis, however. Certainly using a Radian6 or a Scoutlabs monitoring tool is not the same as doing Big Data analysis. Monitoring ‘what people are saying on the web’ rarely requires a lot of data crunching.

Besides coming up with a consistent definition of Big Data, we also need to find individuals to hire who know how to leverage the tools to crunch big data numbers, have the time to dedicate to a Big Data project, and have the experience to learn from their findings.

Putting definitions aside, it is clear that marketing departments need to get serious about Big Data (large data sets that can’t be handled by traditional tools) and focus on integrating the tools and resources for this area in their organization. I recommend that they hire someone who has experience in handling big data sets. Some of the ideal characteristics include:

  • Intellectual curiosity and a strong desire to solve problems
  • Experience in data research
  • Open-mindedness and the ability to look at problems from different perspectives
  • A touch of skepticism to challenge traditional beliefs and practices
  • Ability to frame and communicate ideas based on data findings

After all, ‘if you can’t measure it, how are you going to improve on it?’