WP Calais Archive Tagger

The Calais Archive Tagger plugin automatically goes through your archives and tags every post you’ve written. The plugin uses the Open Calais API to perform semantic analysis of your post text and suggest tags. If a post already contains a suggested tag, that tag isn’t added, but other new tags found are. It takes about 5 minutes to tag 200 posts.
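
As a rough illustration of that merge rule, here is a minimal sketch in Python rather than the plugin’s own PHP. The function name merge_tags and the case-insensitive comparison are assumptions made for this example; the plugin itself adds tags through WordPress API functions.

```python
def merge_tags(existing_tags, suggested_tags):
    """Return the existing tags followed by any suggested tags not already present.

    Hypothetical helper: comparing tags case-insensitively is an assumption,
    not documented plugin behavior.
    """
    seen = {tag.strip().lower() for tag in existing_tags}
    merged = list(existing_tags)  # existing tags are always kept
    for tag in suggested_tags:
        cleaned = tag.strip()
        key = cleaned.lower()
        if key and key not in seen:
            seen.add(key)
            merged.append(cleaned)
    return merged


if __name__ == "__main__":
    print(merge_tags(["WordPress", "plugins"], ["Plugins", "Open Calais"]))
```

Because the existing list is copied first and only extended, a post can gain tags in this sketch but never lose one.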

Also see the Calais Auto Tagger plugin, which adds tag suggestion to your post writing screen. These two plugins work together to make tagging both new and past content simple, but can be used separately as well.

The Calais Archive Tagger requires an Open Calais API key. Getting a key is as easy as filling out two forms; it’s an instant, automated process. First, go to the Open Calais site and use the “Register” link at the top of the page to create an account. Then, request an API key by filling out this form. Enter your API key on the Calais Configuration tab of your plugins page.

Calais Archive Tagger is compatible with WordPress 2.3+ and WordPress 2.5+ blogs. It is free for personal and commercial use, but may not be redistributed without permission. Please e-mail me if you want to do that.

Current Version

Version: 1.4
Release Date: 3/23/2009
Download: WP Calais Archive Tagger at the WordPress Codex

Version 1.1 adds a rate limiter (2 posts processed per second) to ensure you don’t exceed the Calais API rate limit (2 requests per second and 40,000 requests per day). I’ve also wrapped the API call in a try/catch block so any exceptions won’t result in a loop condition. Version 1.2 adds a check to make sure old tags are never lost when adding new ones, and no longer adds e-mail addresses found as tags.
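
The rate limiting and exception handling described above can be sketched as follows. This is a Python illustration, not the plugin’s PHP code: RateLimiter, tag_archive and the fetch_tags callback are hypothetical names, and the sketch enforces only the per-second limit, not the daily cap.

```python
import time


class RateLimiter:
    """Space calls so that at most max_per_second happen each second."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def tag_archive(post_ids, fetch_tags, limiter):
    """Call fetch_tags once per post, throttled, skipping posts that fail."""
    results = {}
    for post_id in post_ids:
        limiter.wait()
        try:
            results[post_id] = fetch_tags(post_id)
        except Exception as error:
            # A failed request is recorded as "no tags" so the loop moves on
            # to the next post instead of retrying the same one forever.
            results[post_id] = []
            print(f"Skipped post #{post_id}: {error}")
    return results
```

With RateLimiter(2), consecutive requests are kept at least half a second apart, which matches the 2 posts per second figure quoted above.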

Notes

I recommend backing up your WordPress database before using this for the first time. There is no risk of damaging the database, as this plugin uses WordPress API functions to add the tags (no direct database access), but if you’re not happy with the tags it adds, you may want the ability to undo the additions easily.

This plugin relies on the Open Calais Tags PHP class, which requires PHP 5 web hosting with PHP’s cURL extension enabled (the majority of web hosts). Also see my blog stats plugin for W3Counter.

Installation

Unzip the archive and upload the files to your wp-content/plugins directory. Then activate the plugin from the plugins tab of your WordPress administration area. You’ll now have a “Calais Archive Tagger” link on your plugins menu where you can enter your API key and start the tagging process.

  • http://blog.kaplak.com/ Morten Blaabjerg

    I can echo what Thy is experiencing. That is about the only thing I can’t get to work. All links to tag pages work fine and there are no ;’s in the tag slugs.

    Thank you for a very nice plugin. I took the liberty of creating an adapted version for FeedWordPress users, available here: http://www.kaplak.com/wiki/index.php?title=FWP_Calais_Autotagger

    This adaptation tags each individual item as it comes in via the FeedWordPress plugin.

  • Pingback: » WP Calais Archive Tagger - WordPress Plugins Catalog

  • http://osprojects.info/ Lynne

    Using this on WP2.6.3 it worked like a charm. However, on WP2.7 Beta 2 and Beta 3 the semi-colons that Thy and Morten are reporting magically appeared.

    It may be that Calais has altered their results in the week between using it on 2.6.3 and when I used it again on 2.7 Beta. It doesn’t appear to be coming from the plugin.

    Since I don’t intend to run the tagger over old posts again (now that they are nicely tagged), I simply used the WP Search & Replace plugin to search the terms table (e.g. wp_terms) and delete all instances of ; in the names. This can also be easily done with a SQL query in the database.
    Hope that helps someone.

  • http://leeclemmer.com Lee

    Experiencing the same problem with the semi-colons (e.g. “tag;”). Reported the issue on the OpenCalais forum to see if it’s coming from their side:

    http://opencalais.com/node/11332#comment-578

    Let me know if any of you have found a resolution to this!

    Thanks,
    – Lee

  • Jamison Fitzgerald

    Great plugin, works great. I had to edit it to remove the trailing ; but all in all it’s pretty nice. Got it set up to run through cron. Good job. 🙂

  • Pingback: 12 Must have wordpress 2.7 Plugins :: What is a Blog

  • http://www.articlesnatch.com matt

    Love the plugin. Any plans on updating it to still work after March 15th? api.opencalais.com will be shut down then; they have the new R4 format.

    http://opencalais.com/news/calais-40-update-test-now-full-40-release-coming-march-15th

    thanks

  • Joss Winn

    I just wanted to add to the questions about whether you plan to update the plugins for WP 2.7 and Calais R4 compatibility?

    Thanks for everything you’ve done with the plugin so far.

  • Pingback: The Bumpy Rolling Out of Kaplak Stream - And What Not To Do To Piss Off Google — Kaplak Blog()

  • Hermann

    Hi there mate, great plugin!
    I am testing the plugin and while it looks perfect on short posts, it looks like it builds too many tags on long posts.

    I have some 2,600 posts to tag, and after a while the plugin becomes slow because the page keeps scrolling up and down. Could you disable the output, or find a way using JavaScript to show the current record and the number of records left on a single line?

    It would also be perfect to have a feature to resume where the last interruption occurred, without reprocessing the whole db, while still allowing a rebuild from the beginning (two options).

    Regards

  • Hermann

    Ditto! The plugin has hung twice due to connection errors, and now it starts again from the beginning!
    Is there a simple way to modify your plugin to skip tagged posts?

  • Hermann

    I did find a way to skip posts: you must change the first JavaScript call from (0) to whatever you want.

  • http://bloggerfaq.co.cc Blogger Tips

    Loved the plugin, but I agree with Chriss: it needs more features, such as tagging only the last N posts, tagging posts from the last week or the current month, or tagging posts between two dates. Tagging by date is very important, because otherwise running the tagger from the first post to the last every time is a waste of resources and time. I say this because I use automatic RSS-fetching plugins like WP-o-Matic and try to tag the imported posts with your plugin, so I set up a cron job to run it every night. With around 2,500 posts, and more arriving daily, the cron job doesn’t work. Running the tagger from beginning to end every time seems impossible: it stops, or it uses very high resources, or Firefox runs out of RAM and crashes.
    What I’m advising is very important for the future of your plugin.
    Tagging between dates, tagging the last xx posts, and tagging the last x days, x weeks or x months are very important.
    The cron job you specify for this plugin should also include the features I described: if I set a cron job to run every Monday, it should tag only that week’s posts. 🙂
    I hope what I’m asking for is possible. 🙂
    Good luck…

  • http://www.tiendanotebooks.com.ar Ariel

    Hi Dan,
    Your plugin is a great idea, but it didn’t work for me.

    Here is my hosting info:
    I’m running WP 2.7.1
    My blog is inside the /blog directory
    PHP Version 5.2.0
    cURL enabled (libcurl/7.11.2 OpenSSL/0.9.7f ipv6 zlib/1.2.1)
    My blog is in Spanish
    I don’t have the calais-auto-tagger plugin installed

    The error:
    When I start the process, it shows me an empty message (no tags detected) for each post, like this:

    Tagged post #32:
    Tagged post #33:
    Tagged post #34:

    Can someone help me with this issue?
    Thanks, Ariel

  • http://www.dangrossman.info Dan

    The plugin is working, Calais is just returning no tags. It probably doesn’t work for your language, or requires you to tell it the language, which the plugin can’t automatically do.

  • http://N/A LEIVA

    Wow, this plugin works great. I downloaded it tonight and it works perfectly. But is it possible to run it with a cron job, or something similar without a cron job, to scan for new posts each day and add the tags automatically? I have many blogs, all running WP-o-Matic without the cron job, and I don’t want to log in to each blog to click the Tagger. Is there any way to run the tagger automatically without logging in to each blog? Thanks

  • Pingback: Tagging old post backcatalog with WordPress « Josh.st

  • http://marissamendoza.net sikiş

    This plugin is awesome, but my API didn’t work :S

  • http://catch22blog.com FlashLight

    I tried the Auto Tagger, and it gave me “language not supported” errors; then I thought I should use the Archive Tagger first. It would hang on post #4. Then I disabled the Auto Tagger.

    The Archive Tagger is working just fine, producing tags (maybe some need a revision), but does not add them to the posts. At least I can’t figure out how to add them to posts.

    Can anyone help?

    Thx.

  • http://catch22blog.com FlashLight

    Some waiting is required. The result is not delivered immediately.

    Sorry.

  • http://www.simonwhatley.co.uk Simon Whatley

    Hi Dan

    I am running WP2.8.4 and have installed your Calais Archive Tagger.

    I have installed a valid OpenCalais API key, as I also run your other Calais plugin successfully.

    When I start the process, it shows me an empty message (no tags detected) for each post, like this:

    Tagged post #3:
    Tagged post #3:
    Tagged post #3:
    Tagged post #3:

    I am running on PHP 5.2.9

    cURL is enabled with the following information:
    libcurl/7.19.4 OpenSSL/0.9.8b zlib/1.2.3 libidn/0.6.5

    My blog is inside a sub-directory named “blog”.

    Can you tell me what may be wrong? Unlike other commenters, I am running an English-language blog with approximately 250 posts.

    Thanks,
    Simon

  • http://www.simonwhatley.co.uk Simon Whatley

    The solution appears to be given by @flashlight. The archive tagger does not work if the auto tagger is also active. Deactivate the auto tagger and the archive tagger will run fine.

    Only a minor headache! Great functionality though, thanks Dan.

    Thanks

  • Apurv

    Hi! This is a lovely plugin but I have a question.

    Does it _replace_ the original tags of the archived stories or only _add_ to them?

  • http://www.perspicuousasmud.com Mark

    Hi, I have been trying out this plugin on my site which is using wordpress 2.8.4

    Everything is fine, except for one thing, which isn’t fatal but does interrupt my flow.

    Archive tagger picks nice keywords etc. It saves the updated post just fine, but when I click on the Update Post/Publish Post button I get the following error:
    Error 500 – Internal server error

    An internal server error has occured!
    Please try again later.

    I don’t get this for anything else. I tried uninstalling the plugin and the error went away, but came back when I reinstalled it.

    I had a looky around the permissions, and they all seem to be as one would expect, so I’m a little foxed right now. Any thoughts?

  • http://www.perspicuousasmud.com Mark

    Dan, I take back everything I just said! I was using the taggaroo plugin!
    Sorry,
    Mark

  • Pingback: WP Calais Archive Tagger | Blue Orbs

  • Jenny

    I tried this and it worked great. It would be even better if we could limit how many tags it adds. I would want say 3-6 tags per post. Some posts ended up with like 20 tags.

  • matteo borsacchi

    Hello Dan, thanks for this useful plugin.
    Any chance of making it work only on the last # posts added?
    It’s annoying to retag an entire archive every time…

  • GUEST

    I have exactly the same problem. Any solution to this?

  • Pingback: The blog post as a scientific article: citation management « Henry Rzepa