Meaning = Data + Structure October 22, 2007

Posted by jeremyliew in data, meaning, semantic web, structure, user generated content.

Through TechCrunch, I saw the video “Information R/evolution” (5 minutes, worth watching).

The video’s key message is that when information is stored digitally rather than in the material world, our assumptions about how we get to information, and how information gets to us, are substantially disrupted. That allows for high quality (and high quantity) content that is generated, organized, curated and disseminated by users.

It’s an entertaining video and spot on. However, I think it glosses over one key point about making information truly useful. User generated content, often unstructured, can be very hard to navigate and search through. Adding structure makes the data vastly more meaningful.

Search engines are the best example of how adding structure (a search index) to an unstructured data set (the list of all websites) makes the dataset more useful. Whether that structure is established by link popularity (as Google and all modern search engines do) or by human editors (as Yahoo started out) affects the size and quality of the structure, but even a rudimentary structure built by humans is better than no structure at all.
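To make the "index as structure" point concrete, here is a minimal sketch of an inverted index, the simplest kind of search index. The page names and text are made up for illustration, and real search engines layer ranking signals such as link popularity on top of something like this.

```python
from collections import defaultdict

def build_index(pages):
    """Map each lowercase word to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index

def lookup(index, query):
    """Return the sorted ids of pages that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return []
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return sorted(result)

# Hypothetical toy "web": page id -> page text
pages = {
    "a.example": "cheap flights to tokyo",
    "b.example": "tokyo sake bars",
    "c.example": "cheap sake online",
}

index = build_index(pages)
print(lookup(index, "cheap sake"))  # ['c.example']
```

Without the index, every query means rereading every page; with it, a query is a few set lookups.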

Social networks are another great example of how adding structure (a social graph) to an unstructured data set (personal home pages) improves the data’s usefulness. There were plenty of successful examples of personal home pages and people directories in the late 90s, including Tripod and AOL’s People Connect, but none of them had the high levels of user engagement that MySpace, Facebook, Bebo and the current generation of social networks have.

One of the key themes of Web 2.0 has been the rise of user generated content. Often this content has been largely unstructured. Unstructured data is hard to navigate by search – you need to rely on the text, and that can be misleading.

Take one of my favorite websites, Yelp, as an example. If I do a search for diabetes near 94111, I get one relevant result (i.e. a doctor) in the top 10 – the rest of the results range from tattoo parlors to ice cream parlors, auto repair to sake stores. All contain the word “diabetes” in a review, some humorously, others incidentally.

This isn’t a one-off either; try baseball mitt, TV repair or shotgun. In every case, the search terms show up in the text of the review, which is the best that you can hope for with unstructured data.
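A rough sketch of the difference, using invented businesses and reviews (this is not Yelp’s data or API, and the `category` field is a hypothetical piece of structure): plain text matching returns anything that mentions the word, while a single structured field is enough to narrow the results to what I actually wanted.

```python
from dataclasses import dataclass

@dataclass
class Business:
    name: str
    category: str   # hypothetical structured field
    reviews: list   # unstructured review text

# Invented toy data for illustration only
businesses = [
    Business("Dr. Lee Family Practice", "doctor",
             ["Great help managing my diabetes."]),
    Business("Sweet Scoops", "ice cream",
             ["So good it will give you diabetes!"]),
    Business("Ink Spot", "tattoo",
             ["The artist asked about my diabetes before starting."]),
]

def mentions(business, term):
    """True if any review of the business contains the term."""
    term = term.lower()
    return any(term in review.lower() for review in business.reviews)

def text_search(term):
    """Unstructured search: match the term anywhere in review text."""
    return [b.name for b in businesses if mentions(b, term)]

def structured_search(term, category):
    """Same text match, restricted by the structured category field."""
    return [b.name for b in businesses
            if b.category == category and mentions(b, term)]

print(text_search("diabetes"))
# ['Dr. Lee Family Practice', 'Sweet Scoops', 'Ink Spot']
print(structured_search("diabetes", "doctor"))
# ['Dr. Lee Family Practice']
```

The hard part, of course, is where the category comes from in the first place.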

Recently I’ve become intrigued by companies that are adding structure to unstructured data. There seem to be at least three broad approaches to this problem:

1. User generated structure
2. Inferring structure from knowledge of the domain
3. Inferring structure from user behavior

I’m not smart enough to know if this is the semantic web or web 3.0, or even if the labels are meaningful. But I do know that finding ways to add or infer structure from data is going to improve the user experience, and that is always something worth watching for.

I’m going to explore the three broad approaches that I’ve seen in subsequent posts, but would love to hear readers’ thoughts on this topic.

I’ve found this post on the structured web by Alex Iskold to be very helpful in thinking about this topic.

Comments»

1. My Personal “Keep Me Up To Date On The Top News” blog » Meaning = Data + Structure - October 22, 2007

[…] Check it out! While looking through the blogosphere we stumbled on an interesting post today. Here’s a quick excerpt: This isn’t a one off either; try baseball mitt, TV repair or shotgun. In every case, the search terms show up in the text of the review, which is the best that you can hope for with unstructured data. Recently I’ve started to become … […]

2. Nik - October 22, 2007

Hi Jeremy,

What do you think of existing services that could be examples of each approach? I can only think of examples for the 2nd approach: for example, vertical information/search sites (e.g. product review aggregators like Bazaarvoice or PowerReviews) that take unstructured data and make sense of it because of their domain expertise?

3. lawrence - October 22, 2007

The trade off for entrepreneurs is that unstructured UGC data seems to grow faster than structured. But structured data can provide more value to consumers, and is easier to monetize with keyword driven ad networks.

From a UGC perspective, site administrators can force structure by requiring every site contribution to have a parent category, or descriptive tags. The problem is that the more obstacles you put in place before content can be submitted, the less participation you are going to get.

The dream scenario is a system that is able to accept unstructured user generated content, then crunch it in an automated fashion, and output it in a way that has structure. Like Google, I guess.

4. Brendan Taylor - October 22, 2007

> I’m not smart enough to know if this is the semantic web or web 3.0, or even if the labels are meaningful.

That’s exactly what it is (don’t be intimidated by it, the idea of the SW is really just as simple as you’ve laid out).

Strictly speaking, the term “Semantic Web” usually means using RDF (a data format optimised for making links between different data sets), but really any structured data on the Web is an improvement.

5. Brendan Taylor - October 22, 2007

Nik, anything where a person’s input goes directly into a database is an example of the first approach. eg. my friends and interests on Facebook, my del.icio.us bookmarks, my purchases on Amazon, etc.

There’s tons of structured data out there, it’s just mostly only exposed to the public in unstructured forms (eg. big blobs of HTML).

6. jeremyliew - October 23, 2007

Nik and Lawrence: I’m going to take a crack at both your questions in subsequent posts.

7. Meaning = Data + Structure: User Generated Structure « Lightspeed Venture Partners Blog - October 24, 2007

[…] Meaning = Data + Structure: User Generated Structure October 24, 2007 Posted by jeremyliew in Consumer internet, data, meaning, semantic web, structure, user generated content. trackback I’ve been thinking about how the explosion of user generated content that has characterized web 2.0 can be made more useful by the addition of structure, ie meaning = data + structure. […]

8. CoryS - October 24, 2007

Jeremy –

During some exploratory interviews into research labs, I came across an interesting problem that is likely one important facet of getting to a structured web: finding an efficient transition from today’s vast pool of unstructured data to a structured one (which may fit into #2 or #3 on your list, or be something completely different in terms of technology).

When I talked to the executive director of U. of California’s tech transfer office, he noted quite insightfully that they’ve got a huge knowledge repository, but that forcing users to re-enter it all into a new structured database was simply ridiculous, and that attacking the problem of transitioning existing data should be first on any semantic web technology project.

Clearly the ability to make more data associative across UGC platforms (as Brendan notes) has the potential to create value so long as the transparency in access to the data has reciprocal value for the host sites. That’s always the rub, isn’t it.

Look forward to your expanded thoughts here.

9. alexiskold - October 25, 2007

Jeremy,

What your post puts into the spotlight is relevancy. Relevancy is what is often lacking in our web experiences today, because the information is not structured and computers are bad at figuring out what we want.

To understand how we can make our experiences more relevant, we look at how people interact in real life:

1) Our interactions are contextual
2) We understand each other via language
3) We learn over time, i.e. the games are iterative
4) We have the chance to ask for clarification

Stunningly, none of these 4 critical things are present on the web today. Search is just a one-time deal, the data has no meaning, there is little or no learning and memory, and computers can only clarify misspellings, not misunderstandings.

The next web is about progressively moving towards more real-life like interactions and the structure is a key piece of it.

Thanks for reading my posts,

Alex

10. Meaning = data + Structure: More thoughts on user generated structure « Lightspeed Venture Partners Blog - October 27, 2007

[…] data, meaning, metadata, semantic web, user generated content. trackback My post claiming that Meaning = Data + Structure and follow up post exploring how User Generated Structure is one way that structure can be added to […]

11. Meaning = Data + Structure: Inferring Structure from domain knowledge « Lightspeed Venture Partners Blog - October 29, 2007

[…] web, I’ve been thinking about how to draw more meaning from the content, and the idea that Meaning = Data + Structure. A number of readers commented on my previous post, about user generated structure. They point out […]

12. BlueBlog: Awesome Series On Meaning and Structure From Jeremy Liew - October 29, 2007

[…] His original post, introduces the problem and lays out three approaches that transform unstructured information into structured: […]

13. redopinion.com » Blog Archive » Awesome Series On Meaning and Structure From Jeremy Liew - November 2, 2007

[…] His original post, introduces the problem and lays out three approaches that transform unstructured information into structured: […]

14. MashLogic: Building the Adaptive Web » Blog Archive » From Information Retrieval to Intent Reconciliation - November 5, 2007

[…] recent weeks, my RSS reader has delivered a bunch of great articles and announcements related to the Semantic Web, structure and meaning, and […]

15. Meaning = Data + Structure: Inferring structure from user behavior « Lightspeed Venture Partners Blog - November 19, 2007

[…] trackback A little while ago I started a series about the structured web where I claimed that Meaning = Data + Structure. I followed up with a couple of posts on ways that structure can be added to user generated […]

16. Web 2.5 = structure/semantics with a little help from humans « please mr editor…. - December 4, 2007

[…] Or as Jeremy Liew of Lightspeed Ventures calls it, “meaning = data + structure“. […]

17. 2008 Consumer Internet Predictions « Lightspeed Venture Partners Blog - December 7, 2007

[…] Meaning = Data + Structure. Search on user-generated sites has not been a great experience so far. This year we should start […]

18. Incep sa apara definitii pentru web 3.0 CTI97:=(Catalin Istratoiu); - January 3, 2008

[…] Or as Jeremy Liew of Lightspeed Ventures calls it, “meaning = data + structure“. […]

19. Border Crossing Stats » Meaning = Data + Structure Lightspeed Venture Partners Blog - March 11, 2008

[…] Get more information about this from the author here […]

20. Wenamba.nl » Blog Archive » De betekenis van gegevens - November 3, 2008

[…] niet de enige die hier al eens over heeft nagedacht. Op het blog van Lightspeed, in het artikel Meaning = Data + Structure wordt er gekeken naar verschillende methoden om data betekenis te geven. De keuze hierbinnen is […]

