News companies find risks, opportunities in Internet’s annotation layer
Tech Trends | 29 July 2015
Recently I got to spend five weeks in Silicon Valley as part of a work assignment. These five weeks gave me time to meet with lots of fascinating companies and to spend time with their founders.
The meetings were uniformly interesting. Silicon Valley’s confidence in approaching existing challenges, as well as its ability to re-think existing problems through a unique prism (integrating tech, design, and innovation), is hugely commendable.
Unfortunately for the readers of this blog, almost all of the meetings were in the area of education, or “edtech,” as the Valley terms it. (I am in the midst of a transition where I am increasingly focused on our education endeavours). The one exception was Declara, a company that straddles the worlds of media and education, leveraging its prowess in Big Data and analytics to pry open opportunities in media and education, where it sees immense potential.

Declara is a fascinating company, with an even more fascinating co-founder, Ramona Pierson, who may even be the subject of a movie soon. But this perhaps isn’t the right place for that story.
Rather, what I will highlight is Declara’s media play, with its avowed efforts to build a personalised news engine. To understand audience tastes and interests, Declara encourages its users to build “collections,” grouping articles they find interesting.
It also provides a facility where a user can highlight a phrase/sentence/paragraph that she finds interesting, and comment or share (via social channels) with others. Over time, with repeated usage, the algorithm learns more and more about the user via her collections and mark-ups, and is able to use that to curate news in the form of a personalised feed for the user, with the insights highlighted to enable a quick read.
More interesting than the personalised newsfeed is the facility for highlighting, commenting on, and sharing “insights” that Declara provides its users. This is remarkably similar to what Genius (previously RapGenius, as the site was initially set up to annotate and explain hard-to-understand rap lyrics) is trying to do. The A16Z-funded Genius describes its efforts as attempting to build an annotation or explanatory layer for the web. Declara terms it an insight layer.
Genius is one in a long line of start-ups that have focused on creating an annotation layer. Hypothes.is, a Knight Foundation-funded non-profit initiative that is focused around annotation standards, lists more than 50 attempts, mostly failures (see list). Clearly, a lot of bright minds see this as a challenge and a considerable amount of brainpower has been expended on it.

Why is building an annotation layer of such interest to the world? What is it that excites these start-ups? How lucrative is the opportunity?
Allow me to answer the above by making three distinct but related points.
- At present, it isn’t entirely clear how the likes of Declara or Genius plan to monetise their future dominance over the annotation layer. True, there is potential for ad revenue. Declara plans to embed ads within Insights, visible to readers or recipients of those insights. However, given the overall downward trend in CPMs for online advertising, I am not sure ads alone will cover costs or lead to a compelling model.
One potential upside for monetisation might be video. Declara has patented a technology that makes it easy to snip the relevant 20 to 30 seconds, or even larger time blocks, of a video (say a dramatic moment in an interview), and annotate and share it. If so, I can see a short video ad appended to the 20- to 30-second snippet as a compelling proposition. But there are many ifs here.
- Ad model apart, the interest in annotation is also driven by the broken commenting system on content Web sites. Most message boards or commenting sections, unless well moderated, descend into mudslinging, abuse, and spite. We have seen that with Reddit recently, as earlier with Gawker’s Jezebel comment boards, which were trolled heavily.
Annotation, with its ability to link comments to specific texts and passages, allied with upvoting/downvoting from the community, can keep commenting relevant and focused.
During a recent discussion with education platform Edmodo, I was told that Edmodo’s platform hosted more information about YouTube videos’ suitability for specific education subjects than YouTube itself. Similarly, there is often more context and discussion around an article on Facebook or Twitter than on the article itself.
Today these are distinct comment buckets, in silos. It is unlikely that these will ever be integrated, but there is a genuine case for a unified view of all the chatter about an article. And therein lies the case for an annotation layer such as Genius/Declara.
- Lastly, there is scope for entirely new revenue streams. Genius has encouraged media companies and even journalists to embed its annotation platform (by pasting a few lines of code) into their sites, thereby making those sites annotation-friendly.
If this is adopted by more and more Web sites, this has the potential to create a true annotation layer to the Web. And with increasing adoption, I do see possibilities for new revenue streams:
- What if you could pay $X to see all the annotations by Marc Andreessen? Or Nassim Nicholas Taleb? And what if you had a service that streamed all annotations made by a list of celebrities/influencers in real time? Wait, now doesn’t that sound like Twitter?
- On a parallel note, there is at present no site or even tool to aggregate or rank the best comments, highlighting them to the general public. But if the annotation layer becomes a reality, then aggregating the most upvoted comments becomes possible. It is interesting that Twitter presently has no easy way to find out the most re-tweeted or favourited comment on a particular topic (or hashtag). While user-generated content has benefited a large number of sites, highlighting or ranking the best user-generated content hasn’t been too high on any media site’s priority list.
- A lot of the interest in annotation/comment is also led by the potential it raises for discoverability. All that chatter, commentary, backgrounders — I propose the term mezzodata as an all-encompassing term for these — are essentially data that helps us in understanding, classifying, ranking even. For a company like Google, there would be immense interest in having access to such mezzodata. If a true annotation layer took off, and Google’s bots were locked out of indexing the layer, I can imagine Google ponying up serious money to acquire a Genius.
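To make the idea concrete, here is a minimal Python sketch of what an annotation record and a "best comments" ranking might look like. It is purely illustrative: the `Annotation` fields, the net-score ranking, and the function names are my own assumptions, not how Genius, Declara, or Hypothes.is actually store or rank annotations.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """A comment anchored to a character range of an article (hypothetical schema)."""
    article_url: str
    start: int          # offset of the first highlighted character
    end: int            # offset one past the last highlighted character
    author: str
    comment: str
    upvotes: int = 0
    downvotes: int = 0

    @property
    def score(self) -> int:
        # Assumption: a simple net score; a real service may rank differently.
        return self.upvotes - self.downvotes


def highlighted_text(article_text: str, note: Annotation) -> str:
    """Return the passage the annotation points at."""
    return article_text[note.start:note.end]


def top_annotations(notes, article_url, limit=3):
    """Return the highest-scoring annotations for one article."""
    matching = [n for n in notes if n.article_url == article_url]
    return sorted(matching, key=lambda n: n.score, reverse=True)[:limit]
```

Because each comment carries the passage it refers to and a community score, the same records could feed both a focused discussion next to the text and an aggregated list of the most upvoted comments.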
It is particularly interesting that the interest in annotation is closely linked to a focus on atomization of content. There has been a lot of interest in providing tools at the story level, e.g., Medium, WordPress, etc. But other than Twitter, I can’t think of any mainstream attempt at providing tools to grapple with content at the atomic level. And even Twitter, in recent times, has expanded its data scope upwards, thanks to shortcodes, tweet storms, and tweet shots. The over-arching trend has been to help Twitter expand its data scope, not reduce it.
This is why annotation and the tools it provides are of particular interest to the content community. By helping focus attention on content at an atomic level, and by adding intelligence at this level, the annotation layer can certainly help readers absorb greater amounts of content, faster, and better than they did before.
News sites in particular have a lot to gain from the annotation layer, but they also have a lot to lose. Farming all that mezzodata out to a third party can be dangerous: all that intelligence is stripped out and monetised elsewhere. But by partnering intelligently with start-ups focused on enabling explanatory commentary at the atomic level, news media companies can make their content richer and more meaningful, increasing engagement levels and insight.