I'm Keyvan Nayyeri, a 25-year-old Ph.D. student in the Computer Science department of the University of Texas at San Antonio. I'm also a Software Architect and Developer and previously earned a B.Sc. degree in Applied Mathematics.
This is my blog, where I publish content about various topics, specifically Programming Languages and Compilers, Software Engineering, and Programming.
One of the ongoing trends in the .NET community throughout its existence has been to port many of the famous and helpful Java projects to the .NET Framework. The main reason is that Open Source is older and more common in the Java community, and the .NET community has wanted to get its hands on the rich tools and libraries created for Java as quickly as possible, without spending much time recreating them.
Besides, there have been software projects, teams, and companies trying to migrate from Java to .NET. This has motivated many companies and Open Source projects to write code converters that take Java source code and produce the equivalent code in .NET languages such as C# or Visual Basic. This is feasible mainly because of the many similarities between the two platforms, their underlying structure, and their APIs. Although very good products and tools have been released for this purpose, real-world code always has some problems that must be fixed manually, so the value of these tools lies in reducing the amount of work that has to be done by hand.
This problem has encouraged some Computer Scientists to work on improving the quality of code conversion between languages. The outcome of their work was published at ICSE 2010 as a paper entitled Mining API Mapping for Language Migration, which we discussed recently at our department.
This paper introduces the idea of using source code that was previously translated from Java to .NET to create a mapping between the APIs of both platforms, which resembles a learning process for the system. Later, when translating code from Java to .NET, the system can use this mapping history to convert the source code with fewer problems. They call their new approach Mining API Mapping (MAM), which consists of three main steps.
The Chinese authors apply this to a simple piece of code to exemplify the approach and then provide the results of their evaluation as numbers and percentages that show some improvements. They feed their system with some famous projects that were previously converted from Java to .NET, including Hibernate/NHibernate, Lucene/Lucene.NET, and Log4j/Log4net. With the API mappings from this training, they applied their approach to a few projects and compared the quality of the translated code with the output of the Java2CSharp tool, which they claim is one of the best tools available for this purpose.
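To make the idea of learning an API mapping from already-translated code more concrete, here is a minimal sketch in Python. It is not the algorithm from the paper: it assumes that methods in the Java and C# versions of a project have already been aligned, and it uses a plain co-occurrence count as the mapping heuristic. The aligned data and the function name mine_api_mapping are hypothetical and exist only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical training data: each pair lists the Java API calls used in one
# method and the C# API calls used in its already-translated counterpart.
ALIGNED_METHODS = [
    (["java.util.ArrayList.add", "java.lang.String.length"],
     ["System.Collections.ArrayList.Add", "System.String.Length"]),
    (["java.util.ArrayList.add"],
     ["System.Collections.ArrayList.Add"]),
    (["java.lang.String.length", "java.io.PrintStream.println"],
     ["System.String.Length", "System.Console.WriteLine"]),
    (["java.io.PrintStream.println"],
     ["System.Console.WriteLine"]),
]


def mine_api_mapping(aligned_methods):
    """Map each Java API to the C# API it co-occurs with most often.

    This is an assumed, simplified heuristic, not the paper's technique.
    """
    cooccurrence = defaultdict(Counter)
    for java_calls, csharp_calls in aligned_methods:
        for java_api in java_calls:
            for csharp_api in csharp_calls:
                cooccurrence[java_api][csharp_api] += 1
    # Pick the most frequent C# partner for every Java API.
    return {
        java_api: counts.most_common(1)[0][0]
        for java_api, counts in cooccurrence.items()
    }


mapping = mine_api_mapping(ALIGNED_METHODS)
for java_api, csharp_api in sorted(mapping.items()):
    print(java_api, "->", csharp_api)
```

A translator could then look up each Java call in the resulting dictionary. Calls with no entry, such as those into third-party libraries that never appeared in the training projects, would still need manual work.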
While this approach can make some improvements to this field, there are some challenges to the technique and its evaluation. The experiments are done on projects that were already converted to the .NET platform using an automated tool. To my knowledge, Lucene.NET and Log4net were both ported to the .NET platform with heavy use of automated conversion tools. This can affect the reliability of the experiment. Also, such an approach can only be applied to projects that don't have dependencies on third-party code, because it cannot create mappings for such dependencies. This limits the number of scenarios in which the technique can be applied. Furthermore, no good metrics are used to compare the results. The best metric used is the difference between the number of compiler errors produced by this new tool and by Java2CSharp, but in my experience, one of the most common problems in such conversions is code that runs but doesn't produce the expected output. Additionally, this technique is very limited in scope and can't be generalized to conversions between other languages. The only case where it can work is between Java and .NET, and I doubt that it works with the same quality even in the reverse direction.
All in all, this approach is a good technique to improve the quality of automated code conversions between Java and C# in specific cases, and it can be adopted by different tools.
Following our weekly meetings to discuss the papers published in ICSE 2010, today it was my turn again to present another paper entitled Code Bubbles - Rethinking the User Interface Paradigm of Integrated Development Environments. My first presentation was on Software Traceability with Topic Modeling.
Code Bubbles is the name of a prototype IDE designed with the purpose of changing the User Interface design and user experience of Integrated Development Environments. It applies the bubble metaphor to offer a concurrent view of multiple code fragments and thereby improve the productivity, user experience, and attractiveness of IDEs for developers.
This IDE is built using Windows Presentation Foundation (WPF) as the front-end User Interface framework and Eclipse as the back-end tool. The front end and back end communicate through some callbacks, so Code Bubbles builds on the powerful and extensible Eclipse platform.
The Bubbles IDE relies on the bubble metaphor to replace the traditional window- and tab-based user interface of IDEs (e.g. Visual Studio, Eclipse, or Xcode), which doesn’t allow more than a single view of a code fragment. With the new approach, you can put multiple bubbles on a big canvas, and they can show different files or different portions of a single file to be viewed concurrently by the user.
The authors of the paper have done a set of quantitative and qualitative analyses of the effectiveness of the IDE. Using a quantitative analysis based on 3 projects, 3 test cases, and 3 metrics, they show that in most cases the Code Bubbles IDE performs better than the Eclipse IDE, meaning that it lets the developer view more code at the same time with fewer User Interface operations. In a qualitative analysis with 23 developers who have an average of 10 years of experience with industrial Java programming, they get very positive feedback on the relevance of the IDE.
In my opinion, this IDE can perform much better than traditional IDEs, especially for reviewing and debugging code, but it may not be very advantageous for other scenarios, especially writing a new file. Besides, the Code Bubbles IDE requires higher-end hardware, including multi-core CPUs, graphics cards, and bigger monitors, which is basically caused by the use of WPF technology. This is not a very big problem nowadays, though, because such hardware is becoming common.
Because Code Bubbles is designed for scientific experiments, it’s not available for public download, but you can join the Beta program to get your hands on it and help the research team as well. If you’re interested in seeing the user experience that the Bubbles IDE offers, you can view this video of a common scenario performed on it.
For me, the presentation was more important than the paper itself, because I had spent much time in the past few months learning good presentation skills to revolutionize my presentations, and this was the first one where I applied my new skills and experience to deliver something different. It wouldn’t be easy to get much out of my slides, since they are designed to complement my talk, but you can read the paper to get an idea of its contributions. Of course, I could have enhanced my slides with images, but I didn’t want to spend money on buying images for this small presentation.
I have uploaded my slides here so you can download and view them.
Five years is not a short time that you can ignore, and some old readers may be surprised to hear that my blog is turning five today! I started blogging here on 28 June 2005 and have been doing so for five years. It may be interesting to go back and read my anniversary posts for 2006, 2007, 2008, and 2009. At least, they were very interesting for me!
Such an age for a blog is a sign of maturity, and I think I can safely say that my blog has become mature enough in the past five years. There have been many changes both in my life and on this blog that have made it very different from the early days, in both look and feel and content.
For a quick overview of the past five years: in the first year I focused on republishing news and technical material, and in the second year I started to write technical posts, articles, and tutorials about the current trends in the .NET community. Starting in the third year, I wrote about new technologies that were mostly in the early stages of development, and my content became one of the most influential resources about those technologies. Since then I have had a leading role in the community as well.
The past year was very different in several ways. First, I had the lowest number of blog posts, for a few reasons that I explain later. Second, I moved to the United States and started my Ph.D. program in Computer Science with new concerns. Third, I was more engaged in technical talks and presentations, which took time away from blogging. Fourth, I recently resigned from the Microsoft community to spend my time and effort on the greater good.
All these factors affected this blog and changed it a lot. My posts were mostly about more advanced topics, and I started to publish content about academic and non-Microsoft technologies. For the first time I wrote about advanced Software Engineering concepts, iPhone programming, and Scala, which widened the audience of this blog to some extent.
Unlike the past few years, I only have a few major hits for this year:
As for the reasons for my lower blogging activity, I’d say that my interests have changed over time and I don’t have such a burning passion for blogging anymore. I have great memories of blogging here in the past, but blogging doesn’t seem very exciting anymore. Besides, Twitter has replaced blogging in many cases, and I spend more time publishing quick content over there. Additionally, I’d love to publish more noble content about more advanced topics and develop those ideas, so I prefer to focus on them even if that means only a few posts are published per month.
All in all, I’d expect to have more or less the same level of blogging activity over the next year as well. Many of the blogs that were alive back in 2005 are either dead or have the same level of activity as mine. It’s a great feeling to have maintained a blog through five sensitive years of my life under very difficult circumstances, from my undergraduate studies to my military service and now my Doctorate studies.
In the end, I have to thank all my old and new readers for following this blog and helping me with their feedback.
Following our weekly meetings to present and discuss ICSE 2010 papers on Software Engineering, a few weeks ago we talked about another paper entitled Model Checking Lots of Systems - Efficient Verification of Temporal Properties in Software Product Lines.
In industrial Software Development you may face mass production of different variants of a software system, each customized for the end user’s needs. From one standpoint, a software system is a set of major and minor features, and depending on the circumstances, a user may or may not want some of these features. For example, the core functionality and features of Adobe Photoshop are the same (and constant), but you see different editions customized for specific users’ requirements.
Software systems in industry are developed in families, and the difference between family members is in their features. The number of systems developed in this way is growing, and modeling and verification of such systems have become vital tasks. Obviously, it’s arduous to model and verify such systems by hand because, as the number of features increases, the number of possible combinations grows exponentially.
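The exponential growth is easy to see with a few lines of Python. This sketch assumes every feature is optional and independent of the others, so n features give 2^n possible products; real product lines usually have constraints that rule some combinations out. The feature names are hypothetical.

```python
from itertools import combinations

# Hypothetical optional features of a product line.
OPTIONAL_FEATURES = ["export_pdf", "spell_check", "cloud_sync"]


def all_products(optional_features):
    """Enumerate every subset of the optional features as one product."""
    products = []
    for size in range(len(optional_features) + 1):
        for combo in combinations(optional_features, size):
            products.append(frozenset(combo))
    return products


products = all_products(OPTIONAL_FEATURES)
print(len(products))  # 3 independent features -> 2**3 = 8 products

# The count doubles with every added feature.
for n in (10, 20, 30):
    print(n, "features ->", 2 ** n, "products")
```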
A Software Product Line (SPL) is a set of software systems that share a common set of features, and Software Product Line Engineering (SPLE) is the engineering discipline of reusing features to improve the economics of producing such software at scale.
In this paper, a group of European authors introduce a new approach for model checking the systems that are mass-produced in industry and evaluate their approach with experimental results. First they provide a simple motivating example about a beverage vending machine that, at a very high level, takes money, returns change, and serves soda. They develop the example to show some other possibilities for the system, such as serving tea, cancelling a purchase, and distributing soda for free.
After describing the problem that the paper solves, the authors discuss the challenges of their work and the contributions that they make, then move on to the base concepts and definitions that they use throughout the paper, such as Feature Diagram (FD), Transition System (TS), Finite Automaton (FA), and Featured Transition System (FTS). Essentially, they model a system as a tuple of sets that represent the main components of the system. This lets them use mathematical notation to analyze the behavior of the system.
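As a rough illustration of what a featured transition system can look like in code, here is a minimal Python sketch based on the vending machine example. It is a simplification under stated assumptions, not the formal definition from the paper: each transition is assumed to require exactly one feature, the state and feature names are made up, and the check projects the model onto one product at a time, which is the naive approach rather than the paper's algorithms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    source: str
    action: str
    target: str
    feature: str  # assumed: the single feature this transition requires


@dataclass
class FeaturedTransitionSystem:
    initial: str
    transitions: list

    def project(self, product):
        """Keep only the transitions whose feature belongs to the product."""
        return [t for t in self.transitions if t.feature in product]


# Hypothetical model of the vending machine family.
VENDING = FeaturedTransitionSystem(
    initial="idle",
    transitions=[
        Transition("idle", "pay", "paid", "base"),
        Transition("paid", "change", "selecting", "base"),
        Transition("idle", "free", "selecting", "free"),
        Transition("selecting", "soda", "serving", "soda"),
        Transition("selecting", "tea", "serving", "tea"),
        Transition("selecting", "cancel", "idle", "cancel"),
        Transition("serving", "take", "idle", "base"),
    ],
)


def reachable_actions(fts, product):
    """Collect every action that can occur in the given product."""
    enabled = fts.project(product)
    seen_states = {fts.initial}
    seen_actions = set()
    frontier = [fts.initial]
    while frontier:
        state = frontier.pop()
        for t in enabled:
            if t.source == state:
                seen_actions.add(t.action)
                if t.target not in seen_states:
                    seen_states.add(t.target)
                    frontier.append(t.target)
    return seen_actions


# A product is just the set of features it includes.
soda_only = {"base", "soda"}
soda_and_tea = {"base", "soda", "tea"}
print("tea" in reachable_actions(VENDING, soda_only))     # False
print("tea" in reachable_actions(VENDING, soda_and_tea))  # True
```

Checking each product separately like this repeats the same work for every combination of features, which is exactly what becomes impractical as the number of features grows.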
I skip the mathematical discussion, as it’s beyond the scope of this blog and hard to explain to my regular readers, but the outcome of this modeling (which you can read about in the original paper) is a set of techniques and algorithms to check the systems automatically.
In the end, this foundation is used to implement their method and evaluate it empirically. The FTS model checking technique is applied to a Haskell interpreter (Haskell being a functional programming language) and is used to check a mine pump controller exemplar, and the results of this experiment are provided to show the effectiveness of the technique.
Based on the discussion that we had, this technique is in early development and has a long way to go before it matures. This paper (like many other scientific papers) lacks a thorough experimental analysis, and the only example provided doesn’t convince the reader of the power and effectiveness of the approach.
After our first meeting to discuss ICSE 2010 papers with my presentation on Software Traceability with Topic Modeling, yesterday we had our second presentation on another paper entitled A Search Engine for Finding Highly Relevant Applications. The implementation of the idea introduced in this paper (which I’ll describe shortly in this post) is available on the web and is called the Exemplar code search engine.
As you probably already know, Code Reusability is one of the main principles of Software Development and an important aspect of Object-Oriented Programming. Software developers try to reuse components or pieces of code in their programs in order to speed up the process and reduce costs. Besides, code reusability can help improve the quality of code by focusing on better design and implementation of smaller components.
As a common part of their daily programming, industrial Software Developers search for relevant components, libraries, or code snippets to use in their projects. They often search on code search engines and repositories like SourceForge, Google Code, Koders, CodePlex, and many other services.
Most of these code search engines rely heavily on some textual values entered by project coordinators on the websites such as the title, description, category, tag, or some other attributes.
However, there is a common problem in using these search engines: the relevance of search results, which depends on two major parameters, the careful selection of keywords and the richness of the textual parameters entered by project owners. The first parameter can be addressed simply by better training of users, but the second is more difficult. Whatever you enter for a project, even something very rich, some parts of the project codebase may still be missing from the description, especially for bigger projects that consist of various components.
There have been some attempts to solve this issue with different techniques. The paper that we discussed, which was recently published at ICSE 2010, tries to provide an improvement in this area. The technique consists of searching not only the textual properties of a project in a repository, but also the relationships between the APIs that the project uses, based on the help documents written for those APIs.
In this paper, the authors have tried to apply this idea using two approaches: a pure search in the help documents for the project APIs, and an advanced search in the API documents based on Data Flow analysis of the API calls.
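The first of these two approaches can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how Exemplar is implemented: the project entries and the help text for each API call are hypothetical, the relevance score is a plain count of shared words, and the Data Flow part is left out entirely.

```python
# Hypothetical, paraphrased help text for two real Java API calls.
API_DOCS = {
    "javax.sound.sampled.AudioSystem.getClip":
        "obtains a clip that can be used for playing back an audio file",
    "java.util.zip.ZipOutputStream.putNextEntry":
        "begins writing a new zip entry to compress data",
}

# Hypothetical repository entries: the owner's description plus the API
# calls found in the project's source code.
PROJECTS = {
    "soundbox": {
        "description": "a small desktop utility",
        "api_calls": ["javax.sound.sampled.AudioSystem.getClip"],
    },
    "packer": {
        "description": "compress folders into archives",
        "api_calls": ["java.util.zip.ZipOutputStream.putNextEntry"],
    },
}


def tokens(text):
    """Split text into a set of lowercase words."""
    return set(text.lower().split())


def score(query, project, use_api_docs):
    """Count query words found in the project's searchable text."""
    text = project["description"]
    if use_api_docs:
        # Extend the searchable text with the help documents of the APIs
        # that the project calls.
        text += " " + " ".join(API_DOCS[call] for call in project["api_calls"])
    return len(tokens(query) & tokens(text))


def search(query, use_api_docs=True):
    """Return matching project names, most relevant first."""
    scored = [(score(query, p, use_api_docs), name) for name, p in PROJECTS.items()]
    matches = [(s, name) for s, name in scored if s > 0]
    return [name for s, name in sorted(matches, key=lambda item: -item[0])]


print(search("play audio file", use_api_docs=False))  # []
print(search("play audio file", use_api_docs=True))   # ['soundbox']
```

The poorly described project is invisible to a description-only search, but it is found once the documentation of the API calls it uses becomes part of the searchable text.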
In order to implement this idea, the authors have aggregated around 30,000 Java projects from SourceForge, processed their APIs with the abovementioned approaches, and published this code search engine, called Exemplar, on the web. Then they asked a group of 39 Java developers with different levels of experience to search for some common programming tasks using this search engine under a time limit. In the next step they asked the developers to evaluate the results and rate their relevance as well as their own confidence in their answers.
The experiment is analyzed using statistical methods, and the results show that using the API descriptions improves the relevance of search results, but the use of Data Flow analysis doesn’t have a big impact.
However, it appears that not enough work has been done on the Data Flow analysis, and its implementation is weak and superficial. It seems that the authors agree, because they describe future work in this area toward a stronger implementation of Data Flow analysis to improve the relevance of search results.
All in all, I think that this new approach has good potential to improve the results of code search engines, but a more sophisticated implementation of Data Flow analysis would be costly, and much work will be needed to fine-tune the search engine in this area.