Creating an RSS robot with an input value to search Digg

In this tutorial we will see how to create and test an RSS robot that takes an input value. As input the robot takes some search terms, uses them to search Digg, and then returns an RSS feed based on the search results.

This tutorial builds on the other RSS tutorials about how to build a basic RSS robot and how to use For Each and Repeat-Next loops in an RSS robot. It is assumed that you have read and understood these tutorials, that you have already downloaded the robot development environment RoboMaker, and that you have registered as a user on openkapow. The robot built in this tutorial is also available for download.

Part 1 - Create an RSS robot that has an input value

We start by opening RoboMaker and creating a new RSS robot that starts from www.digg.com. When the "New Robot Wizard" gets to the "Use input values?" screen we choose "Yes".

Enter "searchText" into the text field and then click add. This robot needs only one input value; if a robot needs more, they are added here in the same way.

On the next screen we add the default value for the "searchText" input value. This is the value we will use while developing the robot. We need some text to search for in order to get any results from Digg, so let us use the term "robot" during development. Once the robot is done and published to openkapow.com, the input value "searchText" is not hardcoded to "robot" in any way; it is a parameter that anybody using the robot can specify in the robot call (more about this later).

Now we click on "Finish" and we get the standard RSS robot with a "Load Page" and a "Return Item" step. Since we defined an input value there is one difference, though, so let us look at the Input Objects of our new robot. The RSSInput object has 10 name-value pairs. In this case we only specified one input value, so the attribute "name1" is set to "searchText" and the attribute "value1" is set to "robot". This means that when somebody calls this robot via an openkapow.com URL using the parameter "searchText" (for example "http://service.openkapow.com/....?searchText=robot"), the value of that parameter will be available within the robot in the "value1" attribute. Note that if the robot has several input parameters, their order in the URL does not matter: the parameter "searchText" will always be mapped to name1-value1 (in this case).
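To make the name-value mapping concrete, here is a minimal Python sketch of the idea. It is only a model, not how openkapow is implemented: the function map_inputs, the example.invalid URL and the second parameter "other" are all made up for illustration. It shows that a parameter ends up in the slot where its name was declared, regardless of its position in the URL.

```python
from urllib.parse import urlparse, parse_qs

def map_inputs(url, declared_names, slots=10):
    """Hypothetical model of RSSInput: fill numbered name/value slots by declared name."""
    query = parse_qs(urlparse(url).query)
    rss_input = {}
    for i in range(1, slots + 1):
        # Slot i holds the i-th declared input name, or None if unused.
        name = declared_names[i - 1] if i <= len(declared_names) else None
        rss_input[f"name{i}"] = name
        # The value is looked up by name, so URL parameter order is irrelevant.
        rss_input[f"value{i}"] = query[name][0] if name in query else None
    return rss_input

# "searchText" was declared first, so it lands in name1/value1 even though
# it comes last in this (made-up) URL.
result = map_inputs("http://example.invalid/robot?other=1&searchText=robot",
                    ["searchText", "other"])
print(result["name1"], result["value1"])
```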

Part 2 - Use the input value to search Digg

Now we need to use the input value to interact with the page. This is very simple to do. Right click in the little search box that Digg.com has at the top of its page. Then choose "Enter Text from Attribute" and choose the attribute "RSSInput.value1 (searchText)".

We have now added a new step to our robot, the "Enter Value 1" step. When it is executed you can see that it writes the text "robot" into the search text input field, just as if you had typed it in on the web page.

Now that the search text has been entered, we just need to add a Click step that clicks on the search button.

Clicking the search button will load a new page with the search results, and these are of course the search results that we want the robot to return in its RSS feed. Let's add a For Each loop, some extracts and a Repeat-Next loop that returns all the stories from the first 3 pages of search results. If you are not familiar with how to do this, there are a couple of other tutorials that cover it in detail. When this is done we should end up with a robot that looks something like this:

But all we really know right now is that the robot seems to work fine when the input is "robot"; we do not know how the robot will react to any other input. To find out, we need to test the robot.

Part 3 - Test the robot and add error handling

Open the RoboDebugger. In the debugger we can see what input values the debugger will use when executing the robot; in this case it will of course be "robot".

Run the robot in the debugger once to check that it works as it should. Then change the text for "value1" in the RSSInput object to "cool" and run the robot again. Hopefully this will also work, which means that you have already built a robot that can use dynamic input values. Of course the users of our robot can send in any values they want, and so far we have only tested the robot with input values that return several stories on Digg. But what happens if we use a value for "searchText" that does not return any search results? Let's try: enter the text "asdfgh" and then run the debugger. The result is not very good. The robot fails, since the For Each step does not find any tags to loop through.

If we go to the "Windows" tab we will see exactly why the robot failed: there were no results when the robot searched Digg for the text "asdfgh".

So the robot cannot handle every possible searchText value. To enable it to do so we need to add some error handling to the robot, and this has to be done in RoboMaker, not in RoboDebugger. To move from the debugger to RoboMaker, simply click the button on the error tab in RoboDebugger. This will open RoboMaker in exactly the state that you had in RoboDebugger when the error happened. That means that the For Each step will be selected and RSSInput.value1 will be set to "asdfgh".

There are several ways in which we could improve the robot to handle bad input. We could handle the error on the For Each step where it occurs now, or we could test for a result before even moving on to the For Each step. In this case let us test for the text "No results found" after the search; if the page contains that text we know that nothing has been found. Once we have decided where to handle the error we also have to decide how to handle it. Is it enough that the robot does not crash, or do we need to return an error message so that the caller of our robot is informed of the error? Since this is an RSS robot it is probably enough if it does not crash. When building REST robots it is much more important to return an error message that the caller of the REST service can use.

Let us add the error handling to the robot. Click on the "Repeat" step to make it active and then click on the "No results found" text in the browser view. Move outward in the DOM (click on the arrow icons, in the HTML path, in the browser view or in the HTML source view) until you reach the div with the class "notice" that contains the "No results found" text. Right click on this div and add the condition "Test Tag".

In order to get the "Test Tag" step to work as we want it to, we need to configure it a bit. When we added the step the way we did, the step's Tag Finders were set to find the notice div, but now we need to define what the step should test in this div. That is done in the step's Action tab, in the subtab Basic. Enter the pattern ".*No results found.*" (without the quotes), change the match against to "only text" and the action to "continue if pattern matches found tag". All this means that the step will check the div tag and see if it contains the "No results found" text, and if it does the step will let the flow of execution proceed to the next step.
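The effect of this pattern can be tried out with an ordinary regular expression. The Python sketch below is only an approximation: it assumes the pattern has to match the whole text of the div (which would explain the ".*" on both sides), and the helper name notice_matches and the DOTALL flag (so that line breaks in the text do not get in the way) are choices made for this illustration, not RoboMaker settings.

```python
import re

# Same pattern as entered in the Test Tag step. DOTALL is an assumption
# made here so that ".*" also spans line breaks in the div text.
NO_RESULTS = re.compile(r".*No results found.*", re.DOTALL)

def notice_matches(div_text):
    """Return True if the entire text of the div matches the pattern."""
    return NO_RESULTS.fullmatch(div_text) is not None

# Text resembling an empty search result matches; a normal result does not.
print(notice_matches("No results found for asdfgh"))
print(notice_matches("10 stories about robots"))
```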

If the notice div does not contain the "No results found" text we want the robot to extract all the data and return the RSS feed. To do this we need to change 2 more things about the Test Tag step. The first is that the step needs to generate an error if the text in the div isn't "No results found". To do this we go to the "Error Generation" tab and check "Generate Error When Stopping".

Now there are 2 scenarios in which the Test Tag step will generate an error: when the notice div does not contain "No results found", and when the notice div is not found at all (i.e. there actually is a valid search result). The second change is made on the "Error Handling" tab of the step, where we set the step's "own errors" to be "sent backwards". This means that if the Test Tag step generates an error, the error is passed back to the previous step and needs to be handled there. Why we do this will soon be apparent.

The Test Tag step is now fully configured, but the steps in the robot are not in the order we need them to be. Currently the Test Tag is between "Click Submit" and "Repeat", which means that if the search on Digg does not find anything the robot proceeds to try to extract stories; otherwise the Test Tag generates an error which it sends to the Click step, and this step does not handle that error. Clearly some changes to the robot are needed. The first thing to do is to add a new connection between the Click step and the Repeat step. We do this by moving the mouse to the end of the Click step until a small white arrow shows up, then holding down the mouse button and dragging the arrow to the Repeat step.

When we have done that we right click on the connection between Test Tag and Repeat and choose delete. Now we have a robot that splits in 2 branches after the Click step.

If we executed the robot right now it would first go to the top branch and run the Test Tag step and then (assuming the robot doesn't crash due to an unhandled error) run the lower branch that starts with the Repeat step. We do not want this. We want the robot to try the first branch, and if (and only if) that branch fails it should go to the lower branch and start looping and extracting. To do this we need to configure the error handling in the Click Submit step: change the "Branching Mode" from the default "All Branches" to "Until Successful Branch". This means that the Click step will proceed to the lower branch only if the top branch returns an error. Note that the connections after the Click step are now dashed lines.

This is the way to build "if-then" structures in RoboMaker. If an error occurs, send it backwards. The step that receives the error has "Until Successful Branch" set and moves on to another branch if the first one fails. In this example we have just 2 branches, but you can add as many branches as you need after the Click step. For more information about branching and error handling take a look at the RoboMaker documentation.
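For readers who think in code, the branching behaviour can be compared to trying a list of functions in order until one of them does not raise an exception. The Python sketch below is an analogy under that assumption, not RoboMaker's actual implementation; StepError, until_successful_branch and run_search are hypothetical names, and the page text and stories are passed in as plain values instead of being loaded from Digg.

```python
class StepError(Exception):
    """Stands in for an error that a step sends backwards."""

def until_successful_branch(branches):
    """Try each branch in order and stop at the first one that succeeds."""
    for branch in branches:
        try:
            return branch()
        except StepError:
            continue  # this branch failed, so move on to the next one
    raise StepError("all branches failed")

def run_search(page_text, stories):
    def no_results_branch():
        # Top branch: succeeds only when the notice text is present.
        if "No results found" not in page_text:
            raise StepError("notice div missing or without the expected text")
        return []  # nothing to return, but no crash either

    def extract_branch():
        # Lower branch: reached only if the top branch raised an error.
        return list(stories)

    return until_successful_branch([no_results_branch, extract_branch])

print(run_search("No results found", []))
print(run_search("Search results", ["story 1", "story 2"]))
```

Adding a third alternative is then just a matter of appending another function to the list, which mirrors adding another branch after the Click step.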

The robot is now ready to be tested in RoboDebugger again, but before that let's add one more step that adds no functionality but makes the robot much easier to test. Since a step is only executed in RoboMaker once it has been passed, it is impossible for us to execute the Test Tag step right now, because it is the last step in its branch (this is only a problem in RoboMaker; in the debugger or when published to openkapow.com it is not a problem at all). To solve this little problem, insert a new step after Test Tag and select the action "Do Nothing" for it. This new step does nothing at all, but it is a very practical way to make a robot easier to test or better documented.

If we now test the robot in RoboDebugger we see that it works both with good and bad input data!

Part 4 - Run the robot

The robot is tested and ready to be published so we can call it from our RSS Reader to get the latest news about a specific topic from Digg. Start by publishing the robot on openkapow (if you wonder how to do this then read this tutorial). This gives us the URL http://service.openkapow.com/tutorial/rssdigginput.rss through which we can run the robot from the openkapow server.

If we load this URL in a browser we get to a page where we can specify the value of the "searchText" input value. Here you can run the robot with different input values. Basically, what this page does is put together the complete URL to call the robot with, including the searchText parameter. For example, if you search for "robot" this URL will be called:

http://service.openkapow.com/tutorial/rssdigginput?searchText=robot

If instead you search for Digg stories about Mars you would get this URL:

http://service.openkapow.com/tutorial/rssdigginput?searchText=mars

These URLs can of course be called directly from your RSS Reader instead of going via the page where you can specify the input values.
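If you want to generate such URLs yourself, for example to subscribe to several searches, the query string can be assembled with a few lines of Python. This is a small sketch: the helper name robot_url is made up, the base URL is the one shown above, and it is an assumption that the service accepts standard form encoding (such as "+" for spaces) for search terms containing special characters.

```python
from urllib.parse import urlencode

# Base URL as shown in the examples above.
BASE_URL = "http://service.openkapow.com/tutorial/rssdigginput"

def robot_url(search_text):
    """Build the robot call URL with a properly encoded searchText parameter."""
    return f"{BASE_URL}?{urlencode({'searchText': search_text})}"

for term in ["robot", "mars"]:
    print(robot_url(term))
```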

Since the robot is published on openkapow you can test it out or download it here.

Summary

In this tutorial we have further expanded the Digg RSS robot so that it can take an input value, use this input value to search Digg and then return an RSS feed based on the search result. We also handle the case where no search result is found on Digg. Unfortunately this robot is a bit slow to run, and in the next tutorial we are going to take a look at how to increase its performance. If you are interested in using robots to interact with Digg and create RSS feeds, take a look at the demo Personal Digg RSS Feed and see how this can be done with a combination of robots and JavaScript.
Published Friday, November 24, 2006 1:53 PM by Andreas
