Google Safe Browsing Library for .NET Beta 1
[Update: Beta 2 of my Google Safe Browsing API implementation is ready. Read more about it here.]
As you may remember from a few ages ago (!), I was working on an implementation of the Google Safe Browsing API for .NET as a part of the Subkismet project.
For those who are new to this world, Subkismet is a set of tools and libraries that help .NET developers fight spammers. It was founded by Phil Haack, Jeff Atwood (read the comment below) and others.
Unfortunately I couldn't find enough time to finish this implementation on schedule, but as a part of my recent effort on community activities and open source development, today I spent a few hours finalizing my work and preparing a Beta version.
Overview
This implementation consists of several parts: a component that performs canonicalization according to RFC2396, a component that performs lookups, a synchronizer class that synchronizes local data with the Google server and performs verifications, and a data provider that stores data.
Earlier I released a Beta version of my RFC2396 library for .NET, which I developed in a separate project and then imported into Subkismet. The other parts were developed as a part of the original solution.
This implementation relies heavily on two things: regular expressions and unit tests. Thanks to unit testing, the implementation went much faster and more easily, and it was one of my most interesting uses of unit tests in a project.
By the way, the current version (Beta 1) lacks some features that you may expect:
- Data provider model has not been implemented and there is only an XML provider.
- It doesn't normalize IP addresses.
- Exception handling isn't very powerful.
- Most of my code does not have XML comments (this is inconsistent with the rest of the project).
These limitations are easy to fix. A bigger concern is that I didn't know many phishing or malware sites listed by Google to test against, so there may be some problems with the current version. Therefore I don't recommend using it in production, but it would be good to test it and let me know your thoughts and feedback.
I just wanted to break the silence and bring the project to a good point, so I didn't work on the abovementioned limitations, but I'm going to release Beta 2 in the next couple of weeks and address them. After asking Phil, we may also launch an Alpha or Beta of Subkismet along with Beta 2.
How to Use
There is a Windows Forms application project for testing Subkismet features that you can use to test my code (there is also an ASP.NET web application project, but I haven't added my sample there). If you're interested in using the component in your own applications, here is how to do that.
This library works on two black lists: one for phishing sites and another for malware sites. My implementation uses separate XML files to store data for these two black lists, but the library is able to use a single file or a single database table instead. To use the current version you need to pass three string values to the library: your Google Safe Browsing API key, which you can obtain from the Google site, the path of the phishing XML file and the path of the malware XML file.
As is obvious from the nature of the Google Safe Browsing service, using it takes five steps:
- Update your lists to get the latest data.
- Canonicalize your URLs.
- Perform lookup algorithms on canonicalized URLs to get all possible URLs to check.
- Get the MD5 hash value of all possible URLs.
- Verify URLs.
My library saves you from all these steps and automates the process. In your code, you first pass parameters to the main class in order to update your black lists, and then call the appropriate methods and pass in your comments to check all the URLs in their bodies against the black lists.
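For readers curious about what the lookup, hashing and verification steps involve, here is a minimal sketch. It is not the Subkismet code: the class and method names (LookupSketch, GetCandidates, Md5Hex, IsListed) are hypothetical, the candidate rules are a simplified assumption (host suffixes combined with path prefixes), and it assumes the local black list holds lowercase hexadecimal MD5 hashes. It also expects a URL that has already been canonicalized.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

// Illustrative sketch only; none of these names come from Subkismet.
public static class LookupSketch
{
    // Builds candidate "host/path" strings from an already canonicalized URL:
    // the full host plus shorter host suffixes, each combined with the
    // full path (with and without query) and its directory prefixes.
    public static List<string> GetCandidates(string canonicalUrl)
    {
        Uri uri = new Uri(canonicalUrl);

        List<string> hosts = new List<string>();
        string[] labels = uri.Host.Split('.');
        if (uri.HostNameType == UriHostNameType.Dns && labels.Length >= 2)
        {
            // Keep at least two labels so a bare top-level domain is never tested.
            for (int i = 0; i + 2 <= labels.Length; i++)
                hosts.Add(string.Join(".", labels, i, labels.Length - i));
        }
        else
        {
            // IP addresses and single-label hosts are used as they are.
            hosts.Add(uri.Host);
        }

        List<string> paths = new List<string>();
        string path = uri.AbsolutePath;
        if (uri.Query.Length > 0) paths.Add(path + uri.Query);
        paths.Add(path);
        // Add every directory prefix, for example "/" and "/1/" for "/1/2.html".
        for (int i = 0; i < path.Length; i++)
        {
            if (path[i] != '/') continue;
            string prefix = path.Substring(0, i + 1);
            if (!paths.Contains(prefix)) paths.Add(prefix);
        }

        List<string> candidates = new List<string>();
        foreach (string host in hosts)
            foreach (string p in paths)
                candidates.Add(host + p);
        return candidates;
    }

    // Returns the MD5 hash of the text as a lowercase hexadecimal string.
    public static string Md5Hex(string text)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(text));
            StringBuilder sb = new StringBuilder(hash.Length * 2);
            foreach (byte b in hash) sb.Append(b.ToString("x2"));
            return sb.ToString();
        }
    }

    // Verifies a URL: true if the hash of any candidate is in the black list.
    public static bool IsListed(string canonicalUrl, ICollection<string> blackListHashes)
    {
        foreach (string candidate in GetCandidates(canonicalUrl))
            if (blackListHashes.Contains(Md5Hex(candidate)))
                return true;
        return false;
    }
}
```

With this sketch, a black list that contains only the hash of "example.com/" would flag "http://a.b.example.com/1/2.html?x=1", because "example.com/" is one of the candidates generated for that URL.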
Suppose that I have a Windows form with two buttons. btnUpdate performs an update to get the latest data and store it in the local black lists. btnCheck verifies all URLs in a comment against that data. All inputs come from a few TextBox controls.
Here is the code to update the local phishing and malware lists. It simply creates an instance of the GoogleSafeBrowsing class, sets three properties for the API key, the phishing XML file path and the malware XML file path, and finally calls the UpdateList method for each list in order to update it. UpdateList returns a Boolean result to let you know whether the operation was successful.
private void btnUpdate_Click(object sender, EventArgs e)
{
    // Create an instance of GoogleSafeBrowsing
    GoogleSafeBrowsing gsb = new GoogleSafeBrowsing();

    // Set the three required properties
    gsb.ApiKey = txtApiKey.Text;
    gsb.PhishingFilePath = txtPhishingPath.Text;
    gsb.MalwareFilePath = txtMalwarePath.Text;

    // Update the phishing black list
    gsb.UpdateList(BlackListType.Phishing);

    // Update the malware black list
    gsb.UpdateList(BlackListType.Malware);
}
To check your comments (or URLs) against the local black lists, you create an instance of the GoogleSafeBrowsing class again and this time call its CheckPost method, passing an IComment parameter and an out integer parameter that will be set to the number of bad URLs found in the comment body.
private void btnCheck_Click(object sender, EventArgs e)
{
    // Create an instance of GoogleSafeBrowsing
    GoogleSafeBrowsing gsb = new GoogleSafeBrowsing();

    // Set the three required properties
    gsb.ApiKey = txtApiKey.Text;
    gsb.PhishingFilePath = txtPhishingPath.Text;
    gsb.MalwareFilePath = txtMalwarePath.Text;

    int badUrlCount = 0;

    // Perform a check and get the number of bad URLs
    gsb.CheckPost(GetComment(), out badUrlCount);

    // Display the number of bad URLs
    MessageBox.Show(string.Format("Bad URLs: {0}", badUrlCount));
}
To get an IComment object, you can create an instance of the Comment class that is a part of the Subkismet framework. The following is a simple method that returns such an instance. Note that I didn't set some properties of the Comment object because they're not required by the Google Safe Browsing library.
private IComment GetComment()
{
    // Create an instance of Comment
    Comment comment = new Comment(IPAddress.Parse("192.168.0.1"), "Not required!");

    // Google Safe Browsing library only needs the
    // comment body and the author's URL is optional
    comment.Content = txtContent.Text;

    return comment;
}
As a final note, you can find the Google Safe Browsing library in the Subkismet.Services.GoogleSafeBrowsing namespace. Again, this library is still in Beta 1 and all APIs are subject to change very soon.
Related Information
Here I'd like to point out another implementation of the Google Safe Browsing API in C#, by Mark McAvoy, who contacted me and told me about his implementation three months ago. You can check out his library here. However, I kept working on mine since we needed an integrated library that is suitable for online scenarios.
Download
The Beta 1 version of my Google Safe Browsing API implementation for .NET is available as a part of the source code of our Subkismet project, and you can get the latest version to use it. Note that I upgraded the whole solution to Visual Studio 2008 two weeks ago, so you need to have it installed in order to open the source. Hopefully I'll release Beta 2, which should be more stable and easier to use, very soon.
4 Comments : 12.21.07
Feedbacks
Thank you, Phil :-)
Hi Keyvan,
Small correction-- I'm not involved in Subkismet, that I know of..
Jeff
My fault, Jeff :-)
I thought that CAPTCHA control is done by you.
However, I corrected my post.
#1
Haacked
12.21.2007 @ 11:51 AM
Very cool! Great work Keyvan! I can't wait to try it out.