Open Calais Tags is a PHP class for extracting entities from text using Open Calais. Calais performs semantic analysis of the text, using natural language processing to identify concepts like people, companies and technologies discussed in the text. These are especially useful for suggesting tags for your content such as website articles or blog posts. You could even automatically tag archived content that would take days to go through manually.
You can download the class and example usage here:
dg_open_calais.zip (updated 7/9/2008)
Calais is free for both personal and commercial use, and usage of this class requires a Calais API key. Getting an API key is an easy, automated process. Just click the “Register” link at the top of the site, then request a key through their automated system.
The Open Calais Tags class takes a content string as input, as well as a number of options, and returns a multidimensional array as output. The array’s keys are the entity types detected in the text, and the values are the entities found.
Example input:
April 7 (Bloomberg) — Yahoo! Inc., the Internet company that snubbed a $44.6 billion takeover bid from Microsoft Corp., may drop in Nasdaq trading after the software maker threatened to cut its bid if directors fail to give in soon.
If Yahoo’s directors refuse to negotiate a deal within three weeks, Microsoft plans to nominate a board slate and take its case to investors, Chief Executive Officer Steve Ballmer said April 5 in a statement. He suggested the deal’s value might decline if Microsoft has to take those steps.
The ultimatum may send Yahoo Chief Executive Officer Jerry Yang scrambling to find an appealing alternative for investors to avoid succumbing to Microsoft, whose bid was a 62 percent premium to Yahoo’s stock price at the time. The deadline shows Microsoft is in a hurry to take on Google Inc., which dominates in Internet search, said analysts including Canaccord Adams’s Colin Gillis.
Example output:
Array
(
    [Industry Term] => Array
        (
            [0] => Internet
            [1] => software maker
            [2] => Internet search
        )
    [Person] => Array
        (
            [0] => Steve Ballmer
            [1] => Jerry Yang
            [2] => Colin Gillis
        )
    [Company] => Array
        (
            [0] => Google Inc.
            [1] => Canaccord Adams
            [2] => Yahoo!
            [3] => Microsoft Corp.
        )
    [Currency] => Array
        (
            [0] => USD
        )
)
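To turn that nested array into tag suggestions, you can walk it type by type or flatten it into a single list. The sketch below hard-codes the example output above so it runs without an API key. The entitiesToTags helper is a hypothetical name used for illustration and is not part of the class.

```php
<?php
// Hypothetical helper (not part of the Open Calais Tags class):
// flatten the per-type entity array into one de-duplicated tag list.
function entitiesToTags(array $entities)
{
    $tags = array();
    foreach ($entities as $names) {
        foreach ($names as $name) {
            $tags[] = $name;
        }
    }
    // array_unique keeps the first occurrence; array_values reindexes from 0.
    return array_values(array_unique($tags));
}

// Same shape as the example output shown above.
$entities = array(
    'Industry Term' => array('Internet', 'software maker', 'Internet search'),
    'Person'        => array('Steve Ballmer', 'Jerry Yang', 'Colin Gillis'),
    'Company'       => array('Google Inc.', 'Canaccord Adams', 'Yahoo!', 'Microsoft Corp.'),
    'Currency'      => array('USD'),
);

// Print one line per entity type, e.g. for a grouped suggestion box.
foreach ($entities as $type => $names) {
    echo $type . ': ' . implode(', ', $names) . "\n";
}

// Or collapse everything into a flat list of candidate tags.
$tags = entitiesToTags($entities);
```

Whether a type like Currency makes a useful tag is a judgment call; filtering by the array key before flattening is one way to leave such types out.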
Basic usage is simple. Create an instance of the class with your API key, and call the getEntities method using your content string.
require('calais.php');
$oc = new OpenCalais('your-api-key');
$entities = $oc->getEntities($content);
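If getEntities comes back empty, it may be worth checking the environment first. The comments on this post suggest two requirements: PHP 5 (a reader on an older version hit a parse error in the copy bundled with the WordPress plugin) and the cURL extension (the class calls curl_exec). The following is a minimal sketch of such a check. calaisPreflightProblems is a hypothetical helper, not something shipped in the archive, and the 5.0.0 threshold is an assumption based on those comments.

```php
<?php
// Hypothetical helper, not part of the Open Calais Tags class.
// Returns a list of human-readable problems; an empty array means the
// two requirements mentioned in the comments appear to be met.
// The parameters exist only so the check can be exercised with fixed values.
function calaisPreflightProblems($phpVersion = PHP_VERSION, $hasCurl = null)
{
    if ($hasCurl === null) {
        $hasCurl = extension_loaded('curl');
    }

    $problems = array();
    // Assumed minimum version, inferred from the "You don't have PHP 5" reply.
    if (version_compare($phpVersion, '5.0.0', '<')) {
        $problems[] = 'PHP 5 or later is required (found ' . $phpVersion . ').';
    }
    if (!$hasCurl) {
        $problems[] = 'The cURL extension is not loaded; check php.ini.';
    }
    return $problems;
}

// Report anything that looks wrong on the current installation.
foreach (calaisPreflightProblems() as $problem) {
    echo $problem . "\n";
}
```

An empty result only rules out these two causes. A wrong API key or content with no recognizable entities can also produce empty output.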
A number of settings exist which can be changed through setters on the OpenCalais object:
- setAllowDistribution: true or false. Indicates whether the extracted metadata can be distributed by Calais. Defaults to false.
- setAllowSearch: true or false. Indicates whether future searches can be performed on metadata through the Calais API. Defaults to false.
- setExternalID: Allows you to set an ID for the content to pass on to Calais when it’s submitted for analysis. Defaults to empty string.
- setSubmitter: Allows you to set an identifier for the content submitter. Defaults to ‘Open Calais Tags’.
- setContentType: Allows you to specify the type of content you’re submitting. Can be text/xml, text/txt, or text/html. Defaults to text/html.
- setOutputFormat: Allows you to specify the format of the returned results. The API currently only supports xml/rdf.
- setPrettyTypes: Determines if the keys of the return array will be prettified or in the raw format returned by the Calais API. For example, Calais returns the entity type “IndustryTerm”. If set to true, the array key will instead be “Industry Term”. Defaults to true.
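As an illustration of the setters listed above, here is a sketch that configures the object for plain-text input and raw type names before extracting entities. It assumes each setter takes a single argument, which the descriptions imply but do not state. The function name extractFromPlainText, the 'My Blog' submitter string and the md5-based external ID are placeholders chosen for the example.

```php
<?php
// Sketch only: $oc is expected to be an OpenCalais instance from calais.php.
// Assumes each setter accepts one argument, as the list above implies.
function extractFromPlainText($oc, $content)
{
    $oc->setContentType('text/txt');    // input is plain text, not HTML
    $oc->setExternalID(md5($content));  // placeholder ID derived from the content
    $oc->setSubmitter('My Blog');       // placeholder submitter name
    $oc->setPrettyTypes(false);         // keep raw keys such as "IndustryTerm"
    return $oc->getEntities($content);
}
```

With the class loaded via require('calais.php'), this would be called as extractFromPlainText(new OpenCalais('your-api-key'), $content). Leaving setPrettyTypes at its default of true gives the "Industry Term" style keys shown in the example output.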
This class is distributed under an open source BSD license. The license terms can be found in license.txt of the code archive.



Kev
April 8th, 2008
Anyone to create and maintain a Wordpress plugin for tags auto-suggestion ?
David Peterson
April 8th, 2008
Nice!
Dan
April 8th, 2008
Kev: I’m hoping to work on that some time this week when I get the chance.
Neha
April 9th, 2008
hey Dan…
i am new to this. Just downloaded your source files and trying to run it..
can you please tell me how to use your calais class..
$entities = $oc->getEntities($content);
the function getEntities returns empty string..
i have placed proper key in the source code..
are there any pre-requisites..
i downloaded calais-client.. but not able to execute submissio-tool.bat…
can you please help
thanks
Dan
April 10th, 2008
Neha: Does $content contain a string of English content with entities Calais will recognize?
Kev: The initial WordPress plugin for tag suggestion’s now available here:
http://www.dangrossman.info/wp-calais-auto-tagger/
Dan Grossman » WP Calais Auto Tagger: Automatic Tag Suggestion For Your Posts
April 10th, 2008
[...] just completed the WP Calais Auto Tagger plugin, the obvious first use of my Open Calais Tags class. It adds a tag suggestion box to your WordPress post writing screen which suggests tags based on [...]
Neha
April 10th, 2008
The content is same as your example input.
I was just trying to run your source code. Downloaded the zip and put it in my web folder and added my License ID. Do I need to do anything else?
Dan Grossman » Tagging Large Post Archives Automatically
April 11th, 2008
[...] a PHP class for passing content to Open Calais and getting back tags. Then, a WordPress plugin for tagging [...]
PHP Weekly Reader - April 13th 2008 : phpaddiction
April 15th, 2008
[...] never in a million years related to it, until of course I saw the tag. The class in the article Open Calais Tags might be what I need, I’m sure it will make its way into Zend Framework by next week. Oh YAY it is [...]
Neha
April 16th, 2008
hey dan can you tell me some place where i can test ur PHP class..
or give me the sample input…
Dan
April 16th, 2008
Neha: The sample output above came directly from the example input above.
nico.
April 16th, 2008
Hey Dan,
Thanks a lot for the file!
I just think you forgot ‘Country’ on line 75
cheers
Dan
April 16th, 2008
Thanks for mentioning that nico, I’ve added it here and in the copy bundled with the plugins.
Neha
May 9th, 2008
Hi Dan,
I know that Dan. I downloaded the zip class file for PHP. Hosted it on my web. And added my API key in octest.php. I am trying to run it. I get an error saying
Warning: Invalid argument supplied for foreach() in C:\wamp\www\opencalais\octest.php on line 27
$response = html_entity_decode(curl_exec($ch));
this line returns nothing
Can you please tell me whats wrong
Do I need anything else except the API key to use your class.
the eXternal mind » links for 2008-05-13
May 12th, 2008
[...] Dan Grossman » Open Calais Tags (tags: php library tagging calais opencalais api) [...]
Tom
June 9th, 2008
Hey Dan,
Was using your OC class on my site and had such a breeze getting it working that I forgot all about it. Now it seems that something might have changed with the OC API, as it now always returns no suggestions. Have you released a new version in line with the new API, if it has indeed changed?
Cheers,
Tom
Dan Grossman » Open Calais PHP Class Updated
July 9th, 2008
[...] updated my Open Calais PHP Class with the entity types added in Calais’ last update. It now matches a bunch of new [...]
Andy
July 27th, 2008
@Neha: If you havent got it working yet, may be you might want to check your php.ini ( if on windows ) for whether this line extension=php_curl.dll
is uncommented or not and if the dll is actually in place.
@Tom
Just ran the example today. Seems to work fine without any alteration to code itself.
@Dan
Great job mate thanks. I am now going to try and combine this with Lucene.
Just a query about your WP exploits. Did you manage to get it working. If so I believe you would be storing the tags into a database? If so could you give me a hint on the table structure.
Cheers
Dan
July 27th, 2008
@Andy: WordPress supports tags out of the box, you don’t need your own database. WP Calais Auto Tagger, WP Calais Archive Tagger
Andy
July 27th, 2008
oh.. ok.
i thought i read that you were going to work on something like that?
haha may be i was sleep reading. anyway thanks for the links
Andy
July 27th, 2008
oh and i forgot to mention the query about the tags wasnt for WP itself.
i am working on a different sort of application in which auto-tagging would help and just thought what was the optimum way to execute a tags DB.
now that you have implemented it for WP i shall snoop around and dig in, to have a look at the data handling and the table structure.
cheers
Benjamin Hourigan
August 21st, 2008
When I try to activate the WP Calais Auto Tagger plugin in WP 2.6.1, I get this error:
Parse error: syntax error, unexpected T_CONST, expecting T_OLD_FUNCTION or T_FUNCTION or T_VAR or ‘}’ in /home/18214/domains/benhourigan.com/html/wp-content/plugins/opencalais.php on line 17
What is happening?
Dan
August 22nd, 2008
You don’t have PHP 5, Ben.
Benjamin Hourigan
August 23rd, 2008
Thanks, Dan.
Sammy Kanan
September 2nd, 2008
Great!
Is there any chance of using https instead of http in the request?
Lee
September 19th, 2008
According to http://opencalais.com/APIcalls:
“HTTP POST - Obsolete
In older versions you could also invoke the service via an HTTP POST request using the following URL: http://api.opencalais.com/enlighten/calais.asmx/Enlighten.”
Does this mean this is about to be obsolete since that URL is what you use?
Dan
September 20th, 2008
Hi Lee, it looks like HTTP POST is still supported but they’ve changed the URL a little. I’ll try to update the class and plugins soon.