I'm Keyvan Nayyeri, a 25-year-old Ph.D. student in the Computer Science department of the University of Texas at San Antonio. I'm also a Software Architect and Developer and previously earned a B.Sc. degree in Applied Mathematics. This is my blog, where I publish content about various topics, specifically Programming Languages and Compilers, Software Engineering, and Programming.
[Update: Beta 2 of my Google Safe Browsing API implementation is ready. Read more about it here.]
As you may remember from a few ages ago (!), I was working on an implementation of the Google Safe Browsing API for .NET as a part of the Subkismet project.
For those who are new to this world, Subkismet is a set of tools and libraries that help .NET developers fight spammers; it was founded by Phil Haack, Jeff Atwood (read the comment below) and others.
Unfortunately I couldn't find enough time to finish this implementation on schedule, but as part of my recent effort toward community activities and open source development, today I spent a few hours finalizing my work and preparing a Beta version.
This implementation consists of several parts: a component that performs canonicalization according to RFC 2396, a component that performs lookups, a synchronizer class that synchronizes local data with the Google server and performs verifications, and a data provider that stores data.
Earlier I released a Beta version of my RFC 2396 library for .NET, which I developed in a separate project and imported into Subkismet; the other parts of the project were developed as part of the original solution.
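To give a feel for what URL canonicalization involves, here is a minimal sketch built on the standard System.Uri class. It is not the RFC 2396 library mentioned above and does not follow Google's full canonicalization rules; the class name UrlCanonicalizerSketch and its Canonicalize method are hypothetical names used only for illustration.

```csharp
using System;

// Hypothetical helper for illustration only; not part of Subkismet
// or of the RFC 2396 library described in this post.
public static class UrlCanonicalizerSketch
{
    public static string Canonicalize(string rawUrl)
    {
        if (string.IsNullOrEmpty(rawUrl))
            throw new ArgumentException("URL must not be empty.", "rawUrl");

        string candidate = rawUrl.Trim();

        // Assumption: treat scheme-less input as plain http
        if (candidate.IndexOf("://", StringComparison.Ordinal) < 0)
            candidate = "http://" + candidate;

        // System.Uri lower-cases the scheme and host and
        // collapses "." and ".." path segments for http URLs
        Uri uri = new Uri(candidate, UriKind.Absolute);

        string host = uri.Host.TrimEnd('.');
        string port = uri.IsDefaultPort ? string.Empty : ":" + uri.Port;

        // The fragment (#...) is dropped on purpose
        return uri.Scheme + "://" + host + port + uri.AbsolutePath + uri.Query;
    }
}
```

With this sketch, two spellings of the same address reduce to one string before any lookup, which is the reason a canonicalization step has to come before comparing URLs against a black list.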
This implementation relies heavily on two things: regular expressions and unit tests. Thanks to unit testing, the implementation went much faster and more easily, and it was one of my most interesting uses of unit tests in a project.
By the way, the current version (Beta 1) lacks some features that you may expect:
Regardless of these limitations, which are easy to fix, I didn't know of many phishing or malware sites listed by Google to test this version with, so there may be some problems with the current version. Therefore I don't recommend using it in production, but it would be good to test it and let me know your thoughts and feedback.
I just wanted to break this silence and bring the project to a good point, so I didn't work on the above-mentioned limitations, but I'm going to release Beta 2 in the next couple of weeks and address them. After checking with Phil, we may also launch an Alpha or Beta of Subkismet along with Beta 2.
There is a Windows Forms application project for testing Subkismet features that you can use to test my code (there is also an ASP.NET web application project, but I haven't added my sample there). If you're interested in using the component in your own applications, here I show you how to do that.
This library works with two black lists: one for phishing sites and another for malware sites. My implementation uses separate XML files to store data for these two black lists, but the library is also able to use a single file or a single database table. To use the current version you need to pass three string values to the library: your Google Safe Browsing API key, which you can obtain from the Google site, the path of the phishing XML file, and the path of the malware XML file.
As is obvious from the nature of the Google Safe Browsing service, you can use this service in five steps:
My library saves you from all these steps and automates the process. In your code, you first pass parameters to the main class in order to update your black lists, and then call the appropriate methods and pass your comments to check all the URLs in their bodies against the black lists.
Suppose that I have a Windows form with two buttons. btnUpdate performs an update to get the latest data and store it in the local black lists. btnCheck verifies all URLs in a comment against that data. All inputs come from a few TextBox controls.
Here is the code to update the local phishing and malware lists. It simply creates an instance of the GoogleSafeBrowsing class, sets three properties for the API key, the phishing XML file path and the malware XML file path, and finally calls the UpdateList method for each list in order to update them. UpdateList returns a Boolean result to let you know if the operation was successful.
private void btnUpdate_Click(object sender, EventArgs e)
{
    // Create an instance of the GoogleSafeBrowsing
    GoogleSafeBrowsing gsb = new GoogleSafeBrowsing();

    // Set three required properties
    gsb.ApiKey = txtApiKey.Text;
    gsb.PhishingFilePath = txtPhishingPath.Text;
    gsb.MalwareFilePath = txtMalwarePath.Text;

    // Update phishing black list
    gsb.UpdateList(BlackListType.Phishing);

    // Update malware black list
    gsb.UpdateList(BlackListType.Malware);
}
To check your comments (or URLs) against the local black lists, you need to create an instance of the GoogleSafeBrowsing class again and this time call its CheckPost method, passing an IComment parameter and an integer out parameter that will be set to the number of bad URLs found in the comment body.
private void btnCheck_Click(object sender, EventArgs e)
{
    // Create an instance of the GoogleSafeBrowsing
    GoogleSafeBrowsing gsb = new GoogleSafeBrowsing();

    // Set three required properties
    gsb.ApiKey = txtApiKey.Text;
    gsb.PhishingFilePath = txtPhishingPath.Text;
    gsb.MalwareFilePath = txtMalwarePath.Text;

    int badUrlCount = 0;

    // Perform a check and get the number of bad URLs
    gsb.CheckPost(GetComment(), out badUrlCount);

    // Display the number of bad URLs
    MessageBox.Show(string.Format("Bad URLs: {0}", badUrlCount));
}
To get an IComment object, you can create an instance of the Comment class that is a part of the Subkismet framework. The following is a simple method to get this instance, but note that I didn't set some properties of the Comment object because they're not required by the Google Safe Browsing library.
private IComment GetComment()
{
    // Create an instance of the Comment
    Comment comment = new Comment(IPAddress.Parse("192.168.0.1"), "Not required!");

    // Google Safe Browsing library only needs the
    // comment body and the author's URL is optional
    comment.Content = txtContent.Text;

    return comment;
}
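For readers curious about what checking "all the URLs in a comment body" can look like, here is a rough, self-contained sketch of the general idea: pull URLs out of the text with a regular expression and count the ones whose host appears in a local list. This is not the actual CheckPost implementation. The class BlackListCheckSketch, the method CheckBody, the regular expression and the in-memory host set are all assumptions made for illustration, and the real library works against the data it downloads from Google instead of a plain set of host names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical illustration; not the Subkismet CheckPost implementation.
public static class BlackListCheckSketch
{
    // Simple pattern for absolute http/https URLs (an assumption,
    // not the expression used by the library)
    private static readonly Regex UrlPattern =
        new Regex(@"https?://[^\s""'<>]+", RegexOptions.IgnoreCase);

    // Returns true when no URL in the body is black listed;
    // badUrlCount receives the number of black listed URLs found.
    public static bool CheckBody(string body, ICollection<string> blackListedHosts, out int badUrlCount)
    {
        badUrlCount = 0;

        foreach (Match match in UrlPattern.Matches(body ?? string.Empty))
        {
            Uri uri;

            // Skip anything that does not parse as an absolute URL
            if (!Uri.TryCreate(match.Value, UriKind.Absolute, out uri))
                continue;

            // Uri.Host is already lower-cased; drop a trailing dot if present
            if (blackListedHosts.Contains(uri.Host.TrimEnd('.')))
                badUrlCount++;
        }

        return badUrlCount == 0;
    }
}
```

The out parameter mirrors the shape of the CheckPost call shown earlier, so the calling code can report the number of bad URLs in the same way.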
As a final note, you can find the Google Safe Browsing library in the Subkismet.Services.GoogleSafeBrowsing namespace. Again, I'd say that this library is still in Beta 1 and all APIs are subject to change very soon.
Here I'd like to point out another implementation of the Google Safe Browsing API in C# by Mark McAvoy, who contacted me and informed me about his implementation three months ago. You can check out his library here. However, I kept working on mine since we needed an integrated library that is suitable for online scenarios.
The Beta 1 version of my Google Safe Browsing API implementation for .NET is available as a part of the source code for our Subkismet project, and you can get the latest version to use it. Note that I upgraded the whole solution to Visual Studio 2008 two weeks ago, so you need to have it installed in order to open the source. Hopefully I'll release Beta 2, which should be more stable and easier to use, very soon.
Haacked
Dec 21, 2007 11:51 AM
Very cool! Great work Keyvan! I can't wait to try it out.
Keyvan Nayyeri
Dec 21, 2007 11:55 AM
Thank you, Phil :-)
Jeff Atwood
Dec 22, 2007 2:37 AM
Hi Keyvan,
Small correction-- I'm not involved in Subkismet, that I know of..
Jeff
Keyvan Nayyeri
Dec 22, 2007 8:02 AM
My fault, Jeff :-)
I thought that CAPTCHA control is done by you.
However, I corrected my post.