The Calais Archive Tagger plugin automatically goes through your archives and tags every post you’ve written. The plugin uses the Open Calais API to perform semantic analysis of your post text and suggest tags. If a post already contains a suggested tag, that tag isn’t added, but other new tags found are. It takes about 5 minutes to tag 200 posts.
Also see the Calais Auto Tagger plugin, which adds tag suggestion to your post writing screen. These two plugins work together to make tagging both new and past content simple, but can be used separately as well.

The Calais Archive Tagger requires an Open Calais API key. Getting a key is as easy as filling out two forms; it’s an instant, automated process. First, go to the Open Calais site and use the “Register” link at the top of the page to create an account. Then, request an API key by filling out this form. Enter your API key on the Calais Configuration tab of your plugins page.
Calais Archive Tagger is compatible with WordPress 2.3+ and WordPress 2.5+ blogs. It is free for personal and commercial use, but may not be redistributed without permission. Please e-mail me if you want to do that.
Current Version
Version: 1.4
Release Date: 3/23/2009
Download: WP Calais Archive Tagger at the WordPress Codex
Version 1.1 adds a rate limiter (2 posts processed per second) to ensure you don’t exceed the Calais API rate limit (2 requests per second and 40,000 requests per day). I’ve also wrapped the API call in a try/catch block so any exceptions won’t result in a loop condition. Version 1.2 adds a check to make sure old tags are never lost when adding new ones, and no longer adds e-mail addresses found as tags.
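To illustrate the idea behind the rate limiter and the try/catch wrapper, here is a minimal sketch in Python. It is not the plugin's actual PHP code: the names `RateLimiter`, `tag_posts` and `fetch_tags` are hypothetical, and the sketch assumes one API request per post.

```python
import time


class RateLimiter:
    """Space calls evenly so that at most `per_second` happen each second."""

    def __init__(self, per_second=2.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / per_second
        self.clock = clock
        self.sleep = sleep
        self.last_call = None

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self.clock()
        if self.last_call is not None:
            remaining = self.min_interval - (now - self.last_call)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last_call = now


def tag_posts(posts, fetch_tags, limiter):
    """Call fetch_tags(text) for each (post_id, text) pair.

    fetch_tags is a hypothetical stand-in for the API call. An exception
    skips that one post instead of aborting or repeating the run.
    """
    results = {}
    for post_id, text in posts:
        limiter.wait()
        try:
            results[post_id] = fetch_tags(text)
        except Exception as exc:
            results[post_id] = []
            print(f"post #{post_id}: skipped ({exc})")
    return results
```

Catching the exception per post is the design choice that matters here: one bad response costs one post, and the loop always moves on to the next one.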
Notes
I recommend backing up your WordPress database before using this for the first time. There is no risk of damaging the database, as this plugin uses WordPress API functions to add the tags (no direct database access), but if you’re not happy with the tags it adds, you may want the ability to undo the additions easily.
This plugin relies on the Open Calais Tags PHP class, which requires PHP 5 web hosting with PHP’s cURL extension enabled (the majority of web hosts). Also see my blog stats plugin for W3Counter.
Installation
Unzip the archive and upload the files to your wp-content/plugins directory. Then activate the plugin from the plugins tab of your WordPress administration area. You’ll now have a “Calais Archive Tagger” link on your plugins menu where you can enter your API key and start the tagging process.



Dan Grossman » Tagging Large Post Archives Automatically
April 11th, 2008
[...] First, I wrote a PHP class for passing content to Open Calais and getting back tags. Then, a WordPress plugin for tagging posts as you write them. Now, taking it one step further again, here’s a plugin for automatic tagging of your post archives. [...]
SHaiTaaN
April 12th, 2008
The problem I’m facing: the plugin got activated, I entered the API key and clicked to start tagging, and now the status I’m getting is…
Status: Tagging in progress…
It’s been 30 minutes now and I have 177 posts, so how long will it take to tag them all? I suppose it’s not working.
Dan
April 12th, 2008
You already said in your other comment that you don’t have PHP 5. This plugin requires PHP 5. It won’t tag any posts for you. WordPress has also ended support for PHP 4, which is obsolete by several years. Get your host to upgrade.
Ahni
April 12th, 2008
Great plugin Dan, this can be extremely useful. Though I wish there was some way to limit the number of tags it grabs… After processing the first 90 posts, It made 1500 tags (including over 2 dozen phone numbers and obscure sentences)… I’m afraid to see what it’ll look like in the end… :0
Anyways, I’m a little bit curious to know if it’s possible to organize the tags like we see here? That would probably be a lot of work, but I thought I’d ask anyways.
Cheers.
Looks like it’s going to take several days to make all the tags though.
I wish there was a way to somehow limit the number of tags it picks up. I’ve got 900 posts on my site, and with the limited number of queries per day it seems to have stopped making tags at #95. (I’m testing it on a local version of my blog.)
Chris Masse
April 12th, 2008
Hi,
Have you seen this comment?
wordpress.org/support/topic/168436
-
When we still see “Status: Tagging in progress…” and the page does not refresh with new line of tags, what should we do? Should we abort it and redo it again? Or should we wait a long time?
-
This plugin worked great on a little blog of mine, and stalled on another small blog. No idea why.
-
Thanks a lot,
Chris Masse
Chris Masse
April 12th, 2008
Hi,
This plugin deletes old plugins… which is not a good thing.
Thanks for listening.
Chris Masse
Dan
April 12th, 2008
@Chris: You should see lines showing up immediately after that. If you don’t, you’re probably running PHP 4 or don’t have cURL, which are required. The plugin definitely does not delete other plugins.
@Ahni: It has that metadata on what type of entity each tag is, but tags aren’t exactly hierarchical in WordPress. You’d have to leave the tag system to keep that metadata on the tags, right?
The Calais rate limit is 2 queries per second and 40,000 queries per day. That should be plenty to handle a 900 post blog. Perhaps your connection is actually so fast that it sends and receives the request in less than half a second despite no parallel processing?
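As a quick back-of-the-envelope check of those limits, here is a small Python sketch. The function name `minimum_run` is hypothetical, and it assumes exactly one API request per post.

```python
def minimum_run(posts, per_second=2, per_day=40_000):
    """Return (seconds, days) needed at the stated limits, one request per post."""
    seconds = posts / per_second
    days = -(-posts // per_day)  # ceiling division: whole days of quota needed
    return seconds, days


seconds, days = minimum_run(900)
print(f"900 posts: at least {seconds / 60:.1f} minutes, within {days} day(s) of quota")
```

This is only a lower bound set by the rate limit. Real runs are slower because each request takes time as well; the post above quotes about 5 minutes for 200 posts.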
Chris Masse
April 12th, 2008
I meant “This plugin deletes old TAGS”… Sorry for the typo.
-
I am on DreamHost. they run PHP 5.2.3.
-
I think they have “cURL”…
-
Thanks.
Chris Masse
Chris Masse
April 12th, 2008
So to recap:
1. Your plugin seems to freeze after a while.
2. Your plugin deletes old tags, instead of adding new tags and leaving intact the old tags.
-
If these 2 problems could be solved, then this plugin would be great.
Thanks a lot,
Chris Masse
Dan
April 12th, 2008
Chris: Please wait 15 minutes and re-download the plugin from the WP plugin site. It’s updated so that it ensures no old tags are deleted in the tagging process.
Ahni
April 12th, 2008
“You’d have to leave the tag system to keep that metadata on the tags, right?”
Ah, I see. Well, it would still be a great feature to have (tag categories!) Perhaps this is something WP will add in the future.
“…your connection is actually so fast that it sends…”
Yeah, that must have been it. It kept going this time, but I seem to have run into another problem. It stopped creating new tags and started mirroring the ones I’ve added in the past. Is there any chance the rate limiter you added skips adding tags if the server’s too fast? (btw I’m testing it on a local install of my blog)
Dan
April 12th, 2008
It should be showing both old and new tags for each post listed. I had it add the existing tags to the list before the save_tags call to deal with what Chris reported. Perhaps a change to how they’re displayed will clear that up.
I’ve updated the plugin again so that it displays only the tags from Calais, even though it still preserves any tags already on the post. WordPress updates the .zip archive on their site every 15 minutes, so within 15 minutes of this comment you can get the update.
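The idea of adding the existing tags to the list before saving can be sketched as follows. This is an illustrative Python sketch, not the plugin's PHP code: `merge_tags` is a hypothetical name, the case-insensitive comparison is an assumption, and the e-mail filter is a simple pattern standing in for whatever check version 1.2 uses.

```python
import re

# Simple pattern for "looks like an e-mail address" (an assumption, not a full validator).
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def merge_tags(existing, suggested):
    """Return existing tags followed by new suggestions, never dropping old ones."""
    merged = list(existing)
    seen = {tag.lower() for tag in existing}
    for tag in suggested:
        tag = tag.strip()
        key = tag.lower()
        # Skip blanks, tags the post already has, and e-mail addresses.
        if not tag or key in seen or EMAIL_RE.match(tag):
            continue
        seen.add(key)
        merged.append(tag)
    return merged


print(merge_tags(["WordPress", "php"], ["PHP", "Open Calais", "dan@example.com"]))
```

Because the existing tags are copied first and only appended to, a save that replaces the post's whole tag list still keeps every old tag.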
Chris Masse
April 13th, 2008
The plugin worked well on my 2 small blogs. However, on the big blog, the process stopped after post #194. (I have over 4,000 posts.) If you or Calais could solve this problem, then that would be great.
One important feature to add to your plugin would be to have a range of posts to tag… instead of tagging all…
Like: Do tag only posts from May 2007… or do tag only posts ID#34 to post ID#230. That way, next time we re-run this plugin, we wouldn’t have to re-tag the old tags already tagged by this plugin in a previous session….
The location of this plugin should be under “Manage”, for its tagging functions… and under “Plugins” or “Options” for the API keys.
Just my 2 cents,
Chris
Chris Merriman
April 13th, 2008
Note for any other Bluehost customers, who find they are still running on PHP4 boxes - You do NOT need to contact tech support to be swapped over any more. Go to your CPanel, click PHP Config, then change to PHP5 or PHP5 FastCGI . Users of other hosting companies might find they can do the same, but I only use BH, so can’t test, sorry.
About to take the plunge and auto tag some 1800 posts now
Thanks for all your work Dan.
Alex
April 13th, 2008
It seems the API cannot handle languages different than English…
When I use the plugin on my Italian-written blog I got a long queue of errors starting with:
Fatal error: Uncaught exception ‘OpenCalaisException’ with message ‘Unsupported document language’ ……
I consider that a not-so-little limitation….
Chris Masse
April 13th, 2008
The plugin seems to appear 2 times at wordpress:
wordpress.org/extend/plugins/wp-calais-archive-tagger/
wordpress.org/extend/plugins/calais-auto-tagger/
Mao-B
April 13th, 2008
I’m sorry I have to say this, but this plugin is rubbish, at least for my blog. None of the created tags have anything to do with the post content. Whatever it does, it is not the semantic analysis I first thought of.
Chris Merriman
April 13th, 2008
Second note - this time for people not using their own PC when running auto-tag archives…
Make sure that FireFox is set to run new searches in a NEW tab. It is a little depressing to get 70% of the posts tagged, search for something, then realise that you’ve just wasted all that time
If future versions of the plugin could support either breaking down the process by categories/months, or if an internal counter could be set, so work would not be repeated after you stop for whatever reason, that would be great.
To Mao-B above, sorry to hear that Calais’ service didn’t hit the nail on the head for you; it seems to be doing fairly well for me so far. A small minority of posts have no tag at all created, but I’ll look into that later, and see if there is some sort of obvious pattern. Just a thought: does Calais’ semantic search service definitely work on languages other than English?
The Best Blogging Software (WordPress) + The Top 60 WordPress Plugins | Midas Oracle .ORG
April 13th, 2008
[...] WP Calais Archive Tagger 1.2 » Dan Grossman (url) Tags your entire post archive by performing semantic analysis on the post text. [...]
Chris Masse
April 13th, 2008
@Mao-B
Surely this plugin is not perfect and needs review by a human being after the automatic tagging. But it does a lot of good. It adds many good tags. We can later delete the bad tags, or we can use other tools (like a search-and-replace tool for tags) to refine and finish the tagging process.
-
This plugin is a good start to tag old posts that have no tags.
-
Ahni
April 13th, 2008
Hey, thanks for your efforts Dan. I must have missed the latest updates (because all my old tags are gone now) but I’m happy to say it finished without a hitch
As a general comment about Calais, I think it does need a bit more work. There are a number of keywords it didn’t pick up on that I would think it should have. For instance, I have many topics about gold, titanium, and uranium but it made no tags for these words (above all, that’s what I was hoping it would do.)
In any case, thanks again Dan.
Dan
April 13th, 2008
@Chris Masse: Please slow down on the commenting! Those two plugins are different. One tags your archives, the other adds tag suggestion to your post writing screen. I’ll keep your suggestion about incremental processing in mind.
@Chris Merriman and Alex: Calais only supports English language text. It’s still a beta product, and I believe additional languages are part of the third milestone on their roadmap. As they create more ontologies, it’ll recognize more entities within the text.
Heffo
April 13th, 2008
Hi Dan, I get the following error when trying to activate the ‘WP Calais Archive Tagger’ plugin:
Plugin could not be activated because it triggered a fatal error.
Parse error: parse error in ….\wp-content\plugins\calais_archive_tagger.php on line 132
Are you aware of this issue? Is it a problem with the file or my WP?
I’m using a mac and tried Safari 3.1 and FF 3.0b5.
Thanks, Heffo
Dan
April 13th, 2008
@Heffo: This plugin requires PHP 5. You only have PHP 4.
Heffo
April 13th, 2008
Ah ok. Thanks Dan.
David
April 16th, 2008
I assume you know of this prize: http://www.semantic-web.at/index.php?id=1&subid=57&action=resource&item=1646
Dan
April 16th, 2008
Yeah David, unfortunately the deadline to send a proposal for the bounty was in March, and I didn’t see it until this month.
Matt Ellsworth
April 21st, 2008
I just tried to use this on several sites. It worked GREAT on the sites that were small. However I tried it on one with about 4000 posts. I let it run for a while - but it started to make firefox use up 100% cpu (on a quad core box - so it just grabs one core).
It would be great to have a pause/resume button.
It would also be cool if it didn’t list every post, but rather just show a revolving list of 10 or so.
Matt Ellsworth
April 24th, 2008
I thought I would post back an update… I let it run on the site with 3500 posts. It ran fine - it took about 10 hours or so and firefox would periodically go from using 0% to 100% of the cpu (that one is a 2ghz machine). But I just let it run.
I’m now running this on another blog with about 10,000 posts, and i’m just going to let it go, and see how it does. so far so good.
Matt Ellsworth
April 24th, 2008
Me again… sorry about all the comments… I figured out that if you want to stop it part way through - just make note of the post number.
1. open up the file calais_archive_tagger.php
2. Go to line 80 (at least in my file)
3. Look for this
Status: Click here to start tagging your posts.
See where it says calais_archive_run(0) - replace 0 with the post id that you want to start with.
This worked for me. Hope it helps.
and dan- thanks again for this great plugin!!!
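The workaround above amounts to starting the run at a chosen post ID instead of 0. A minimal Python sketch of that idea follows; it is not the plugin's code, and `run_from` and `process` are hypothetical names.

```python
def run_from(post_ids, start_id, process):
    """Process posts in ascending ID order, skipping IDs below start_id.

    Returns the last ID processed (or None), so it can be noted down
    and used as the starting point of the next run.
    """
    last_done = None
    for post_id in sorted(post_ids):
        if post_id < start_id:
            continue
        process(post_id)
        last_done = post_id
    return last_done


done = []
print(run_from([5, 1, 3, 10], 3, done.append), done)
```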
indi
April 28th, 2008
Just wondering before I bork my blog. The plugin has stopped generating tags after post 635. Should I start it again? Will there be double tags?
Jim
May 25th, 2008
Hi Dan,
Great work: thanks for your contribution. Everything works as advertised on my admittedly very small blog.
A feature suggestion, if you don’t mind (but I don’t know if it is possible): Could the archive-tagger either mark posts that have been processed so one can re-run the plugin without re-processing posts that have already been processed, or could the user choose and limit which posts to tag, perhaps starting from a given date, category, or page(s).
Thank you again for your efforts.
OpenCalais tagging implemented on blog
June 20th, 2008
[...] WP Calais Archive Tagger [...]
indi.ca » Infinite Kottu
June 21st, 2008
[...] other thing I’d like to do is tags. I can tag all the old posts automatically using this hook-in to the Reuters Semantic Engine (Calais). Pretty cool thing they’ve built, it picks out [...]
Shaun Robinson
June 25th, 2008
This plug in is awesome. Is there any way to tag pages and not just posts?
irulbyzan
July 15th, 2008
Dear Dan,
I would like to ask a question about this plugin. From the description it looks like a good plugin, but when I upload it and try to activate it, a fatal error message appears after a few minutes.
example:
Plugin could not be activated because it triggered a fatal error.
Parse error: syntax error, unexpected ‘{’ in /home/archi/public_html/wp-content/plugins/wp-calais-archive-tagger/calais_archive_tagger.php on line 132
And I can’t activate the plugin. Could you help me, please? Thanks in advance.
Dan
July 15th, 2008
@irulbyzan: This plugin requires PHP 5, while you’re trying to run it on PHP 4.
Philix
July 23rd, 2008
Thanks, this is exactly what i was looking for.
Lee
July 23rd, 2008
Hey Dan,
Very cool little plugin, perfect for a feed aggregator. The only thing that was missing was the ability to automatically add tags periodically, without having to do it manually. See with feed aggregation, you don’t actually create the posts yourself, nor even ever look at the Writing screen, so a different solution was needed.
I took the liberty to hack together a file, calais_cron_tagger.php, which basically is your file slightly modified and trimmed down and cron readable. So if needed just plop that file into your wp-calais-archive-tagger folder, set up a cron job to point to the file in question, and you’re ready to rock and roll. Posts being tagged while you’re sleeping, kinda cool.
You can download the file here:
http://www.leeclemmer.com/calais-cron-tagger.rar
It’s a pretty dirty hack but works.
Enjoy, and thanks again!
Greetz from Philly, 215 w00t!
- Lee
Lee
July 23rd, 2008
PS: reading through the above comments, it seems that some users were having Firefox problems: as this little “cron_tagger” actually works in the background (and sends output as email), this may be useful for people with a lot of posts… just a thought.
Leonaut.com
July 25th, 2008
WP Calais Archive Tagger…
The Calais Archive Tagger plugin automatically goes through your archives and tags every post you’ve written. The plugin uses the Open Calais API to perform semantic analysis of your post text and suggest tags. If a post already contains a suggested …
Pratik Sinha
August 7th, 2008
Is there any chance this would work with PHP4? My hosting service has PHP4 and no PHP5.
Saulo Benigno
August 7th, 2008
Ok, hey Dan, just like ‘Chris Masse’ (no reply for his comment) said:
“The plugin worked well on my 2 small blogs. However, on the big blog, the process stopped after post #194. (I have over 4,000 posts.) If you or Calais could solve this problem, then that would be great.”
I have one blog with more than 4,000 posts too, and it stops on #72 or #96 or #196… it stops at random times.
Any fix for that ?
Thanks.
Dan
August 7th, 2008
You’re probably running into the maximum execution time limit set in your PHP configuration. Unfortunately the plugin doesn’t keep track of where it left off to resume processing, so there’s no easy fix from my end (until I have time to write a new version). You can set_time_limit(0) to see if your host allows you to override the setting.
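One way a future version could keep track of where it left off is to save a checkpoint after every post and stop before the time limit is reached. The Python sketch below only illustrates that idea; it is not how the plugin works today. The checkpoint file, the `next_id` key, and the function names are all hypothetical.

```python
import json
import os
import time


def load_checkpoint(path):
    """Return the next post ID to process, or 0 if no checkpoint exists yet."""
    if not os.path.exists(path):
        return 0
    with open(path) as fh:
        return json.load(fh)["next_id"]


def run_batch(post_ids, process, path, budget_seconds=20.0, clock=time.monotonic):
    """Process posts until done or until the time budget is used up.

    Returns True when every post has been processed, False when the
    run stopped early and should be started again later.
    """
    started = clock()
    next_id = load_checkpoint(path)
    for post_id in sorted(post_ids):
        if post_id < next_id:
            continue  # already handled in an earlier run
        if clock() - started >= budget_seconds:
            return False
        process(post_id)
        # Record progress after each post so an interruption loses nothing.
        with open(path, "w") as fh:
            json.dump({"next_id": post_id + 1}, fh)
    return True
```

Calling `run_batch` repeatedly (for example from a scheduled job) with a budget below the host's execution limit would work through a large archive in pieces without starting over.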
Saulo Benigno
August 7th, 2008
Well, i’m using the fix posted by “Matt Ellsworth”, it’s working.
Thanks.
» WordPress: Auto Tag while you sleep! WPCalais meets the cron job » Lee Clemmer {dot} com
August 20th, 2008
[...] (8/20/2008): Please be aware that you need to have the Calais Archive Tagger installed first for this to [...]
How to Enable Auto-Tagging for Syndicated Content
September 4th, 2008
[...] right in… First, you’ll need to install and activate the Calais Archive Tagger plugin. (BTW, you will also need to get an API Key by going to http://opencalais.com/) The problem with [...]
Matt
September 11th, 2008
Hey - great plug! Would you mind posting a snippet so that I can make the plugin skip posts that have any tags at all. A simple checkbox in the next release would be grand!
Thanks.
10* Proven plugins to make your Wordpress Blog pop
September 28th, 2008
[...] WP Calais Archive Tagger [...]
Thy
November 3rd, 2008
Hello Dan,
This is a nice plugin, thank you for your great work. Actually, I have an issue with the output of the tags. There is a ( ; ) added next to the anchor text of the tag links (e.g. tags: google;, usa;, people; ). Is there any part of the code I can delete to remove the ( ; )?
Thank you
Morten Blaabjerg
November 11th, 2008
I can echo what Thy is experiencing. That is about the only thing I can’t get to work. All links to tag pages work fine and there are no ;’s in the tag slugs.
Thank you for a very nice plugin. I took the liberty of creating an adapted version for FeedWordPress users, available here : http://www.kaplak.com/wiki/index.php?title=FWP_Calais_Autotagger
This adaptation tags each individual item as it comes in via the FeedWordPress plugin.
» WP Calais Archive Tagger - WordPress Plugins Catalog
November 15th, 2008
[...] Plugin Homepage » [...]
Lynne
November 17th, 2008
Using this on WP2.6.3 it worked like a charm. However, on WP2.7 Beta 2 and Beta 3 the semi-colons that Thy and Morten are reporting magically appeared.
It may be that Calais has altered their results in the week between using it on 2.6.3 and when I used it again on 2.7 Beta. It doesn’t appear to be coming from the plugin.
Since I don’t intend to run the tagger over old posts again (now that they are nicely tagged) I simply used the WP Search & Replace plugin to search the terms table eg. wp_terms and delete all instances of ; in the names. This can also be easily done with a SQL query in the database.
Hope that helps someone.
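For anyone who would rather clean the tag names before they are saved, the fix described above can be sketched like this in Python. It is an illustration only, not the plugin's code and not a database query: `clean_tags` is a hypothetical name, and it assumes the stray semicolons only appear at the end of a tag.

```python
def clean_tags(names):
    """Strip trailing semicolons and drop duplicates created by the cleanup."""
    cleaned = []
    seen = set()
    for name in names:
        tag = name.strip().rstrip(";").strip()
        if tag and tag not in seen:
            seen.add(tag)
            cleaned.append(tag)
    return cleaned


print(clean_tags(["google;", "usa;", "people", "google"]))
```

Merging duplicates matters because "google;" and "google" become the same tag once the semicolon is gone.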
Lee
December 15th, 2008
Experiencing the same problem with the semi-colons (e.g. “tag;”). Reported the issue on the OpenCalais forum to see if it’s coming from their side:
http://opencalais.com/node/11332#comment-578
Let me know if any of you have found a resolution to this!
Thanks,
- Lee
Jamison Fitzgerald
December 21st, 2008
Great plugin, works great, had to edit to remove the trailing ; but all in all its pretty nice, got it setup to run through cron, good job.
12 Must have wordpress 2.7 Plugins :: What is a Blog
January 13th, 2009
[...] WP Calais Archive Tagger: This is a must have for tagging old content, especially if you are importing it from another source such as Blogger or LiveJournal. It effectively tags your entire post archive by performing semantic analysis on the post text. On some sites I run this about once a week and let it add any newly important tags that might be potentially good for searches. [...]
matt
February 23rd, 2009
love the plugin - any plans on updating it to still work after march 15th? The api.opencalais.com will be shut down then - they have the new R4 format.
http://opencalais.com/news/calais-40-update-test-now-full-40-release-coming-march-15th
thanks
Joss Winn
March 19th, 2009
I just wanted to add to the questions about whether you plan to update the plugins for WP 2.7 and Calais R4 compatibility?
Thanks for everything you’ve done with the plugin so far.
The Bumpy Rolling Out of Kaplak Stream - And What Not To Do To Piss Off Google — Kaplak Blog
March 20th, 2009
[...] The first installment of Kaplak Stream came with just about fifteen feeds, of which a handful were submitted by owners of niche websites. Others were feeds from sites such as YouTube, Amazon.com, Twitter (tracking particular subjects or keywords) and Boing Boing. Enough to provide the stream with some variety and “head” which would also test the autotagging performed by Open Calais via a modified version of Dan Grossman’s WordPress plugin. [...]
Hermann
April 12th, 2009
Hi there mate, great plugin!
I am testing the plugin and while it looks perfect on short posts, it looks like it builds too many tags on long posts.
I have some 2,600 posts to tag and after a while the plugin becomes slow due to page scroll up and down, could you disable the output or find a way using java-script to output on the same line what is the current record and how many records are left?
Also it would be perfect to enable a feature to resume where the last interruption occurred, without reprocessing the whole db, still allowing rebuild from the beginning (two options)
Regards
Hermann
April 12th, 2009
Ditto! The plugin has hanged twice, due to connection errors.. now it starts again from the beginning!
Is there a simple way to modify your plugin to skip tagged posts?
Hermann
April 18th, 2009
I did find a way to skip posts…. you must change the first javascript call from (0) to whatever you want.
Blogger Tips
May 1st, 2009
Loved the plugin, but I agree with Chris: the plugin should have more features, like tagging only the last N posts, tagging posts from the last week or this month, or tagging posts between two dates. Tagging by date is very important; otherwise, starting from the beginning and running to the end every time is a waste of resources and time. I say this because I’m using automatic RSS fetching plugins like WP-o-Matic and I try to tag the imported posts with your plugin, so I made a cron job to run every night. But because of the number of posts the cron job doesn’t work: I have around 2,500 posts and get more daily, which makes it impossible to run tagging from beginning to end every time. Either it stops, or it uses very high resources, or Firefox can’t stand it and crashes from RAM usage.

What I advise is very important for the future of your plugin.
Tagging between dates, tagging the last xx posts, or tagging the last x days, x weeks, or x months are very important.
The cron job for this plugin should also include the features I described: if I set a cron job to run every Monday, it should tag only the posts from that week.
I hope what I describe here is possible.
Good luck…
Ariel
May 3rd, 2009
Hi Dan,
Your plugin is a great idea, but It didn’t work for me.
Here is my hosting info:
I’m running WP 2.7.1
I’ve my blog inside /blog directory
PHP Version 5.2.0
cUrl enabled (libcurl/7.11.2 OpenSSL/0.9.7f ipv6 zlib/1.2.1)
My blog is in spanish language
I don’t have installed the plugin calais-auto-tagger
The error:
When I start the process, it shows me an empty message (no tags detected) for each post, like this:
Tagged post #32:
Tagged post #33:
Tagged post #34:
Can someone help me with this issue?
Thanks, Ariel
Dan
May 4th, 2009
The plugin is working, Calais is just returning no tags. It probably doesn’t work for your language, or requires you to tell it the language, which the plugin can’t automatically do.
LEIVA
June 11th, 2009
Wow, this plugin works great. I downloaded it tonight and it works perfectly. But is it possible to run it with a cron job, or something similar, to scan for new posts each day and add the tags automatically? I have many blogs, all running WP-o-Matic without a cron job, and I don’t want to log in to each blog to click the Tagger. Is there any way to run the tagger automatically without logging in to each blog? Thanks
Tagging old post backcatalog with WordPress « Josh.st
June 19th, 2009
[...] Calais Archive Tagger, a free WordPress plugin, did most of the heavy lifting for me. It connects to a web service called OpenCalais, run by ThomsonReuters (so nothing dodgy is going on with your data, they’re a pretty big publishing conglomerate!) The biggest problem with it is that, given the particular emphasis of OpenCalais towards establishing commonalities between different data sets, it paid a disproportionate amount of attention to proper nouns, and when product names were incomplete (for example, my old Pentax SP500 camera that I often just referred to as “SP500″) it would match tags to other products that had a more complete title. Which would be excellent if that were, in fact, what I was talking about. [...]
sikiş
June 21st, 2009
This plug in is awesome. but my api dont worked :S