In this tutorial we will see how to create, test, publish and use a very simple RSS robot using openkapow. The robot will create an RSS feed that contains only the very top story from Digg. This tutorial describes the process in great detail, and most of it applies to REST robots as well, and to some degree also to Clipping robots.
It is assumed that you have already downloaded the robot development environment RoboMaker and registered as a user on openkapow.com. All images in the tutorial can be viewed at full size: click on the image you are interested in and it will open in a new window. The robot built in this tutorial is also available for download.
Part 1 - Create an RSS robot
Start by opening RoboMaker. If you are using Windows, you will find RoboMaker in the Start menu under Kapow Mashup Server 6.3 Openkapow Edition SR1. In the startup wizard, choose the option "Create an RSS feed..." and click OK.
In the New Robot Wizard that opens, enter the URL the RSS robot should start from. In this case we are going to build an RSS feed from Digg, so we enter "www.digg.com" and click Next.
The next step is to give the robot a name and a description. By default the name will be "RSS feed from" followed by the URL (in this case "RSS feed for www.digg.com"). It is a good idea to change this to something more descriptive and to enter a good description. This information will be visible on openkapow.com once the robot is published there. When you are done entering the robot information, click Next.
The last question in the wizard is whether we want the robot to take any input values. Since we just want the RSS feed to contain the top story on Digg, we do not need any input values. There are other tutorials describing how to use input values to search Digg for a term and create an RSS feed based on the search result, so if you are interested in doing that please take a look at those tutorials. For now we choose "No" and click Finish.
We have now created a very basic RSS robot that contains two steps - one "Load Page" and one "Return Item" step - as you can see at the top of RoboMaker. The Load Page step does just what the name implies: it loads a page, just as if you had used a browser to do the same thing. The Return Item step returns data from the robot and puts it into the RSS feed. The robot will start out with the first step active, which is indicated by the green highlighting of the step. On the right of the screen are the settings for the current step, i.e. Load Page, and there you can see that it is the URL "www.digg.com" that will be loaded in this step.
If you click on the "Return Item" step, all steps before it will be executed, in this case only the "Load Page" step. This means that www.digg.com will be loaded into RoboMaker's internal browser view. RoboMaker executes the steps during development in the same way as when a robot is run on openkapow.com. This feature is very useful, as you will always see what state the robot is in at a particular step.
On the right side RoboMaker now shows the settings for the "Return Item" step instead of the "Load Page" step, since it is now the "Return Item" step that is active. Below the settings for the "Return Item" step you see the input and output objects this robot uses. The input objects are the input parameters that a robot takes and the output objects are the data the robot returns. Since this is an RSS robot there is automatically an "RSS Item" output object, and since we did not define any input objects in the wizard there are no input objects used in this robot. The "RSS Item" output object is what the robot will return when it is executed, so what we need to do now is to define what data the robot should return.
Part 2 - Define what data the robot should return
We just want this robot to return the title of the top Digg story and the URL of the same story. What we want to do is to get that data from the page that the Load Page step loads and put it into the output object RSSItem, which the Return Item step will then return as output from the robot. To do this, make sure that the Return Item step is active, so that the www.digg.com page is loaded. Then click on the title of the top story in the RoboMaker browser view.
Notice that the story title is highlighted in 4 different places: in the browser view, in the HTML source code, in the HTML path and in the DOM (don't worry if you do not know what all this means, you do not need to know this unless you are going to make much more complicated robots than this one). We have clicked on the "a" tag around the story title, and now we want to get the URL and the actual title text from that tag and put them into the RSSItem output object. Let's start with getting the title text. Simply right-click on the title in the browser view and choose "Extract Title".
Several things have now happened. A new step called "Extract Title" has been inserted between Load Page and Return Item.
If you take a look at the output object RSSItem in the lower right of RoboMaker, you see that it now contains the title of the top Digg story. Try clicking on the Extract Title step and you will see that the title of RSSItem no longer has a value. Then click on the Return Item step again and see that the RSSItem title once again has a value. This is another example of how RoboMaker executes each step in the robot during development, so you can always see exactly what values your variables have at a specific step in your robot.
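The behaviour described above, where selecting a step runs every step before it but not the step itself, can be sketched in a few lines of Python. This is only an illustration of the idea and not how RoboMaker is implemented: the step functions, the `STEPS` list, the `state_at` helper and the example story are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, str]


def load_page(state: State) -> None:
    # Stand-in for the Load Page step: uses a canned snippet, not a real request.
    state["page"] = '<a href="http://example.com/story/1">Example top story</a>'


def extract_title(state: State) -> None:
    # Stand-in for the Extract Title step: fills the title attribute of RSSItem.
    state["RSSItem.title"] = "Example top story"


def return_item(state: State) -> None:
    # Stand-in for the Return Item step: marks the item as returned.
    state["returned"] = state.get("RSSItem.title", "")


# Hypothetical robot: an ordered list of (step name, action) pairs.
STEPS: List[Tuple[str, Callable[[State], None]]] = [
    ("Load Page", load_page),
    ("Extract Title", extract_title),
    ("Return Item", return_item),
]


def state_at(step_name: str) -> State:
    """Run every step before step_name and return the resulting state."""
    state: State = {}
    for name, action in STEPS:
        if name == step_name:
            return state
        action(state)
    raise ValueError(f"unknown step: {step_name}")


if __name__ == "__main__":
    print(state_at("Extract Title"))  # the title has not been extracted yet
    print(state_at("Return Item"))    # the title is now part of the state
```

Selecting "Extract Title" in this sketch gives a state without a title, and selecting "Return Item" gives a state with one, which mirrors what the RSSItem output object shows in RoboMaker.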
Let us take a look at the generated Extract Title step. Click on the Extract Title step so you can see the configuration of that step on the right of RoboMaker. The step configuration consists of 4 tabs, so let us take a quick look at each one. For a more detailed look at exactly what configuration each step has, please refer to the RoboMaker help and documentation. The first tab is the "Basic" tab, which contains the name and comment of the step. This is a perfect place to document the step with a brief description and a good name. A step with a comment is marked with a small document icon, and the comment pops up when you move the mouse over the step.
The second tab is the "Tag Finders" tab, which configures which HTML tags the step should interact with. In this case it is the tag path to the top Digg story's link tag. In most cases there is no need to manually edit the tag finders; instead we can just generate them automatically as we have done here. If we do want to change the Tag Finders there are many options for doing so, please see the documentation and other tutorials for more information on this.
Tab number 3 is "Action", which defines exactly what this step is going to do. In this case it will extract data from the tag defined in the Tag Finders tab and put the extracted data into the attribute RSSItem.title. Just as for the Tag Finders tab, there is generally no need to manually edit this information, since the generated settings are very good in most cases.
Finally we have the "Error Handling" tab, in which we can set how errors should be handled if they occur. Error handling is discussed in other tutorials; in this tutorial we simply assume that the world is perfect and that everything works as it should at all times. All steps have the same "Basic", "Tag Finders" and "Error Handling" tabs, it is only the "Action" tab that differs from step to step.
We have now extracted the title of the top Digg story, but we also want the URL of that story. This is done in almost the same way as when extracting the title: right-click on the title in the browser view, and instead of choosing "Extract Title" choose "Extract URL".
A new step called "Extract URL" is then created and in the RSSItem output object the "url" attribute now contains the URL of the top Digg story. Our robot now has 4 steps. This is a very simple robot and we can already be quite confident that it works, but to be safe we want to test it, so time to check out the RoboMaker debugger.
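For readers who are curious what the two extraction steps amount to, here is a minimal Python sketch that pulls the link text and the `href` attribute out of an "a" tag. It is not what RoboMaker generates: the `FirstLinkExtractor` class, the `extract_first_link` helper and the sample HTML are made up for illustration, and the sketch simply takes the first link on the page, whereas a real Tag Finder follows a tag path.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional


class FirstLinkExtractor(HTMLParser):
    """Collects the text and href of the first <a> tag in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.url: Optional[str] = None
        self.title: Optional[str] = None
        self._in_link = False
        self._text: List[str] = []

    def handle_starttag(self, tag, attrs):
        # Only the first link is of interest; later links are ignored.
        if tag == "a" and self.title is None and not self._in_link:
            self._in_link = True
            self.url = dict(attrs).get("href")

    def handle_data(self, data):
        if self._in_link:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            self._in_link = False
            self.title = "".join(self._text).strip()


def extract_first_link(html: str) -> Dict[str, Optional[str]]:
    """Return the title and url of the first link, like a tiny RSSItem."""
    parser = FirstLinkExtractor()
    parser.feed(html)
    parser.close()
    return {"title": parser.title, "url": parser.url}


if __name__ == "__main__":
    # Made-up page fragment standing in for the loaded front page.
    sample = (
        '<div><h3><a href="http://example.com/story/1">Example top story</a></h3>'
        '<a href="http://example.com/story/2">Second story</a></div>'
    )
    print(extract_first_link(sample))
```

The returned dictionary plays the role of the RSSItem output object, with one entry for the title and one for the URL.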
Part 3 - Test the robot
Compared to most development environments, a lot of debugging happens during the actual development. If a variable has the wrong data or if something does not work, you will notice as RoboMaker executes each step during development. For the simple robot we have built in this tutorial there is probably no need for much more testing than that, but for larger and more complicated robots you probably want to test that the robot can handle incorrect and missing input data etc. To do this we use the RoboDebugger, which you open by clicking on the little blue bug in the RoboMaker icon bar.
This opens the RoboDebugger in a new window, and at the top of the window we see the steps of our RSS robot. Here we can run the robot and see what output it produces. If we had used input objects we could also test with different input values. We can either run the whole robot or single-step through it. For each step we can see what values each object attribute has. We can also put in breakpoints so that the debugger stops at a specific step. Let's start by clicking the blue run arrow icon and see the robot be executed.
This simple test shows us what output data the robot would return, and as we suspected it all works perfectly. Let us put in a breakpoint at the Extract URL step and see how that works. Right-click on the Extract URL step and choose "Toggle Breakpoint".
The Extract URL step now has a little blue square indicating that it has a breakpoint. Let's run the robot again. This time it will stop on the Extract URL step.
When the debugger has stopped, check out the state tab. There you can see exactly what state the robot was in when it stopped. Here we see that the robot has extracted the RSSItem.title but has not yet extracted the RSSItem.url. Breakpoints are shown in both RoboDebugger and RoboMaker, but they are only used in the debugger.
A nice feature in the RoboDebugger is the Go To button in the icon bar, which returns us to the main RoboMaker window with the robot at the same location (and with the same state) as it stopped at in the debugger. This is very useful when you have a big robot and you find a bug when testing in the RoboDebugger, since it makes it very easy to identify what the error is and where to fix it. Now the robot is tested and done, so it is time to publish it to openkapow.com so we can use it.
Part 4 - Save, publish and run the robot
We have a pretty cool robot already, but as long as it only exists locally on your computer it is quite pointless. So let us publish it to openkapow.com so that we can use the RSS feed in our RSS Reader and keep updated on the current top story at Digg. The first thing to do is to save the robot locally. This is not necessary in order to publish the robot to openkapow.com, but it is always a good idea to do so anyway. Just choose "Save" from the "File" menu and save your robot file on your hard drive.
The next thing to do is to publish the robot. This means that the robot is uploaded to openkapow.com and can be run from there using a simple URL in a browser or a program (for example in your PHP, JSP, ASP or Ruby on Rails code). Once a robot is uploaded to openkapow.com it is publicly available for everyone to use and download. In order to publish a robot you need to be registered as a user at openkapow.com and you need to have entered your openkapow username and password in RoboMaker. You probably did this already when you installed RoboMaker. If not, simply go into the "File" menu and choose "Edit Username and Password". When the username and password are correctly configured, click the little openkapow robot icon to open the publish dialog box.
The title and description of the robot are already filled in, based on what you entered when first creating the robot. Feel free to edit that information if you want to. Each robot needs to be assigned to a category. In this case we choose "Tutorials and Examples" (if you build a similar robot yourself it should be published in the "News" category, since Digg is a news site). Each robot can also have tags associated with it. These tags make it easier to find a robot when searching on openkapow.com, so add the tags you think best describe your robot. Since we are publishing an RSS robot we can also choose how often it is going to run. RSS robots are executed by the openkapow.com server at a preset frequency and the result is then cached on the server. This makes it much quicker to get the RSS feed when calling it from your RSS Reader or browser. Let's leave that at the default, which is that the robot will be run once an hour.
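To see why caching makes the feed quick to fetch, consider the following Python sketch of a cache that re-runs a robot at most once per interval. It is a simplification under stated assumptions, not openkapow's server code: the `CachedFeed` class and its parameters are hypothetical, and it refreshes lazily when a reader asks for the feed, whereas the server described above runs the robot on a schedule.

```python
import time
from typing import Callable, Optional


class CachedFeed:
    """Serves a cached result and re-runs the robot at most once per interval."""

    def __init__(
        self,
        run_robot: Callable[[], str],
        interval_seconds: float = 3600.0,  # once an hour, like the default above
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._run_robot = run_robot
        self._interval = interval_seconds
        self._clock = clock
        self._cached: Optional[str] = None
        self._last_run: Optional[float] = None

    def get(self) -> str:
        now = self._clock()
        # Re-run the robot only if it has never run or the cache is too old.
        if self._last_run is None or now - self._last_run >= self._interval:
            self._cached = self._run_robot()
            self._last_run = now
        return self._cached


if __name__ == "__main__":
    feed = CachedFeed(lambda: "<rss>...</rss>")
    feed.get()  # first call runs the robot
    feed.get()  # second call within the hour is served from the cache
```

Every request between two runs gets the stored result immediately, so a reader never has to wait for the robot to load and process the page.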
Once we are done adding tags and configuring our robot, we click publish. RoboMaker then contacts openkapow.com and uploads your robot. When this is done you will get back the URL through which the robot is now available to use. Add this URL to your RSS Reader (Netvibes, My Yahoo etc.) and you have your own RSS feed, which works just like any other RSS feed.
When opening that URL in a normal browser we will see the RSS feed XML returned. Use the URL in your RSS Reader and you will instead see the top story on Digg, and it will be updated every hour.
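As a rough idea of what the returned XML contains, the Python sketch below builds a minimal RSS 2.0 document with a single item and reads it back the way a reader would. The exact XML that openkapow generates is not shown in this tutorial, so the `build_feed` and `read_items` helpers, the example values and the mapping of the robot's "url" attribute to the `<link>` element are all assumptions.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple


def build_feed(feed_title: str, feed_link: str, item_title: str, item_url: str) -> str:
    """Build a minimal RSS 2.0 document containing exactly one item."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires a title, link and description on the channel.
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = "Feed with a single top story"
    item = ET.SubElement(channel, "item")
    ET.SubElement(item, "title").text = item_title
    # Assumption: the robot's "url" attribute ends up in the item's <link>.
    ET.SubElement(item, "link").text = item_url
    return ET.tostring(rss, encoding="unicode")


def read_items(xml_text: str) -> List[Tuple[Optional[str], Optional[str]]]:
    """Return (title, link) for every item, as an RSS reader would list them."""
    root = ET.fromstring(xml_text)
    return [(item.findtext("title"), item.findtext("link")) for item in root.iter("item")]


if __name__ == "__main__":
    xml_text = build_feed(
        "Top story feed",
        "http://example.com/",
        "Example top story",
        "http://example.com/story/1",
    )
    print(xml_text)
    print(read_items(xml_text))
```

A browser shows the raw output of something like `build_feed`, while an RSS reader does the equivalent of `read_items` and displays the story title as a link.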
If you want to find the URL of one of your already published robots, you can find it in "Robot Configuration" in the "File" menu in RoboMaker. There you can also change which RSS version the RSS feed will use. The robot is now also available on openkapow.com for everyone to use, and you can download and use the robot yourself from the robots page.
Summary
With a few clicks of the mouse we have created an RSS feed containing the top story from Digg. We haven't written any code at all, and we have not even changed any of the automatically generated configurations. This is a good example of how easy it is to develop robots and how quickly you will see results. Of course it would be a much better robot if it could return all the stories on Digg instead of just the first one, and this is exactly what we are going to do in the next tutorial.