ted serbinski – entrepreneur & web architect

Topic “tags”

Automatically Extracting Tags from Nodes

Automatically tagging content is becoming easier with services like OpenCalais and Yahoo Terms Extractor, which offer free APIs for semantic analysis of content. There’s even a great Drupal module, Auto Tagging (with a great writeup on usage), that ties these services together and makes it even easier.

However, there is still one common issue with these services: they need well-written, rich, keyword-dense articles to produce logical, semantic tags.

Try any of those services with user generated content and you’ll see a common tag each time around: FAIL.

We experimented with over 20,000 pieces of content on MothersClick, and our results showed that these semantic services weren’t producing quality, relevant tags: we were getting very few, if any, relevant tags for our user generated content.

After a little more trial and error, I noticed a simple pattern: more often than not, the title of a user’s post had more applicable keywords for what the post was about than the body did.

So how do you extract just the keywords from the title of a node and make tags from them?
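As a rough illustration of the idea (not necessarily the approach taken in the full post), here is a minimal Python sketch: lowercase the title, split it into words, drop common filler words, and keep what remains as candidate tags. The function name `extract_title_tags`, the `min_length` parameter, and the tiny stopword list are all assumptions made for this example; a real stopword list would be much longer.

```python
import re

# Hypothetical, deliberately small stopword list for illustration only.
STOPWORDS = {
    "a", "an", "and", "are", "can", "do", "for", "get", "how", "i", "in",
    "is", "it", "more", "my", "of", "on", "or", "the", "through", "to",
    "what", "with",
}


def extract_title_tags(title, min_length=3):
    """Return candidate tags from a title, in order of first appearance."""
    # Lowercase, then pull out runs of letters, digits, and apostrophes.
    words = re.findall(r"[a-z0-9']+", title.lower())
    tags = []
    for word in words:
        word = word.strip("'")
        # Skip filler words and very short fragments.
        if word in STOPWORDS or len(word) < min_length:
            continue
        # Keep each tag only once.
        if word not in tags:
            tags.append(word)
    return tags


if __name__ == "__main__":
    print(extract_title_tags("How do I get my toddler to sleep through the night?"))
```

With this stopword list, the sample title above reduces to the words "toddler", "sleep", and "night". How useful the result is depends almost entirely on how thorough the stopword list is.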

posted 20 Nov 2009
  • drupal
  • jquery
  • tags
Code examples and downloadable zip files of code are licensed under a Creative Commons License.
All other content, unless where noted, ©2010 Theodore Serbinski. All Rights Reserved.