Keyvan Nayyeri

Musings of a Ph.D. student in Computer Science

Program into Your Language, Not in It

After starting my Ph.D. program in Computer Science and engaging in active research in Programming Languages and Compilers, I had to learn several new languages and discover the similarities and differences between them, as well as the fundamentals of programming language design. In the first three months of my graduate program I learned and taught Scala, Objective-C, Lisp, Scheme, and ML (besides C/C++ and Java, which I already knew).

One important question that comes to mind after learning and using several languages is which one matters and why we should bother learning new ones. I think I have actively used over 20 programming languages in my life (those that I can remember and count), and every one of them had something new and helpful to offer.

There have been many debates among programmers over choosing between programming languages, such as Java vs. C#, C# vs. Visual Basic, or C vs. C++. The choice is even considered important enough that some programming books are written separately for different languages (e.g. Wrox has different books for C# and Visual Basic, or provides code in both languages).

But one of the most important points neglected by many people is that in many cases such discussions are pointless. Many languages exhibit very similar structures and features, which makes it really easy to choose one of them. Programming languages differ in two major ways: paradigm and domain.

As for programming paradigms, most common programming languages can be categorized as imperative/object-oriented or functional. When it comes to choosing a language, there is a clear line that lets programmers decide whether they want an imperative language or a functional one. For the domain, the line is even clearer, because it’s easy to split languages into general-purpose and domain-specific groups, and again, it’s easy for everyone to decide which one they need.

Therefore, in most cases it’s just a matter of selecting among languages in the same paradigm and domain, and, as is common in industry these days, that means choosing between a few structured languages such as C#, Visual Basic, and Java.

The most notable and influential discussion of this topic was written by Steve McConnell in chapter 34 of his Software Development bible, Code Complete, where he has a section entitled “Program into Your Language, Not in It”. I also cited this section in my Professional Visual Studio Extensibility book, where I outlined my reasons for choosing C# as the main language of the book and leaving out Visual Basic to save paper.

In essence, Steve states that when you simply follow the most obvious paths a language offers to solve a problem, you are programming in the language rather than into it. You should instead think about your goals and choose the best strategy to achieve them by programming into your language. So if your language doesn’t support a feature, you can write your own implementation of it, even if it’s not as rich as something built into the language. At the extreme, programming into a language may lead to some difficulties when you want to apply your approach in a language that doesn’t have good support for it.
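As a minimal sketch of this idea, assume you want Eiffel-style preconditions in Python, which has no built-in syntax for them. The `require` decorator and the `withdraw` function below are hypothetical names invented for this illustration; they are not taken from Code Complete.

```python
import functools


def require(predicate, message):
    """Hypothetical helper: emulate a precondition the language lacks."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Check the contract before running the real function.
            if not predicate(*args, **kwargs):
                raise ValueError(f"precondition failed: {message}")
            return func(*args, **kwargs)
        return wrapper
    return decorate


# Illustrative use: the contract is stated next to the function it guards.
@require(lambda balance, amount: 0 < amount <= balance,
         "0 < amount <= balance")
def withdraw(balance, amount):
    return balance - amount
```

The home-made version is less rich than a language-level contract (there is no static checking, for example), but the design decision, stating what a function expects, stays in the code regardless of what the language offers.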

I have personally experienced this principle in the past months as I was compelled to use different programming languages. In past years I was a professional .NET developer using C# (and sometimes Visual Basic) as my primary language, and now I had to use Java and some other languages to accomplish tasks that were easier to achieve in the .NET languages. Luckily, I had been programming into the .NET languages, not in them, so I found it straightforward to switch to other platforms and languages, even though I still haven’t reached the same level of proficiency in Java that I had in .NET, because it takes time to master all the libraries of a platform.

That being said, my advice to readers who are starting to program professionally on a platform is to keep in mind that at some point they will need to switch between languages and platforms, and that such a transition won’t be easy unless they have tried to program into their language, not in it. These days many .NET and Java community leaders have switched to Python and Ruby and use them in their careers; therefore, many programmers will probably need to go through such big changes in their professional lives in the future, too.


Mining API Mapping for Language Migration

One of the ongoing trends in the .NET community throughout its existence has been to port many of the famous and helpful Java projects to the .NET Framework. The main reason is that Open Source is older and more common in the Java community, and the .NET community has wanted to get its hands on the rich tools and libraries created for Java in the shortest time possible, without spending much time recreating them.

Besides, there have been software projects, teams, and companies trying to migrate from Java to .NET. This has motivated many companies and Open Source projects to write code converters that take Java source code and produce the equivalent code in .NET languages such as C# or Visual Basic. This is feasible mainly due to the many similarities between the two platforms and their underlying structure and APIs. Although very good products and tools have been released for this purpose, real-world code always has some problems that must be fixed manually, so the value of these tools lies in reducing the amount of work that needs to be done by hand.

This problem has encouraged some computer scientists to work on improving the quality of code conversion between languages. The outcome of their work was published at ICSE 2010 as a paper entitled Mining API Mapping for Language Migration, which we discussed recently at our department.

This paper introduces the idea of using source code that was previously translated from Java to .NET to create a mapping between the APIs of the two platforms, which is similar to a learning process for the system. Later, when translating new code from Java to .NET, the system can use this mapping history to convert the source code with fewer problems. They call their new approach Mining API Mapping (MAM), and it consists of three main steps:

  • Aligning the code of the two versions of a project on the two platforms
  • Mining the API mapping between classes
  • Mining the API mapping between methods
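To make the method-level step more concrete, here is a deliberately naive sketch. It is not the algorithm from the paper: it simply counts how often a Java API call and a C# API call appear together in already-aligned method pairs and keeps the most frequent partner. The data and the `mine_mapping` function are hypothetical, and the C# names are simplified strings used only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical, hand-made training data: each entry pairs the API calls
# found in a Java method with those found in its translated C# method.
aligned_methods = [
    (["java.io.File.exists", "java.lang.StringBuilder.append"],
     ["System.IO.File.Exists", "System.Text.StringBuilder.Append"]),
    (["java.io.File.exists"],
     ["System.IO.File.Exists"]),
    (["java.lang.StringBuilder.append", "java.util.List.add"],
     ["System.Text.StringBuilder.Append",
      "System.Collections.Generic.List.Add"]),
    (["java.util.List.add"],
     ["System.Collections.Generic.List.Add"]),
]


def mine_mapping(pairs):
    """Map each Java call to the C# call it co-occurs with most often."""
    counts = defaultdict(Counter)
    for java_calls, csharp_calls in pairs:
        for java_call in java_calls:
            for csharp_call in csharp_calls:
                counts[java_call][csharp_call] += 1
    # Ties are resolved arbitrarily here; a real tool would need more evidence.
    return {java_call: partners.most_common(1)[0][0]
            for java_call, partners in counts.items()}


mapping = mine_mapping(aligned_methods)
```

Even this toy version shows why the quality of the alignment and the amount of training code matter: with too few aligned pairs, unrelated calls that merely sit in the same method are indistinguishable from true counterparts.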

The Chinese authors apply this to a simple piece of code to exemplify the approach and then present the results of their evaluation as numbers and percentages that show some improvement. They feed their system with some famous projects that were previously converted from Java to .NET, including Hibernate/NHibernate, Lucene/Lucene.NET, and Log4j/Log4net. Using the API mappings from this training, they applied their approach to a few projects and compared the quality of the translated code with the output of the Java2CSharp tool, which they claim is one of the best tools available for this purpose.

While this approach can bring some improvements to this field, there are some challenges to the technique and its evaluation. The experiments are done on projects that were already converted to the .NET platform using an automated tool. To my knowledge, Lucene.NET and Log4net were both ported to .NET with heavy use of automated conversion tools. This can affect the reliability of the experiment. Also, such an approach can be applied only to projects that don't depend on any third-party code, because it cannot create mappings for such dependencies. This limits the number of scenarios in which the technique can be applied. Furthermore, no good metrics are used to compare the results. The best metric used is the difference between the number of compiler errors produced by the new tool and by Java2CSharp, but in my experience one of the most common problems in such conversions is code that runs but doesn't produce the expected output. Additionally, this technique is very limited in scope and can't be generalized to conversions between other languages. The only case where it can work is between Java and .NET, and I doubt that it works with the same quality even in the reverse direction.

All in all, this approach is a good technique for improving the quality of automated code conversion between Java and C# in specific cases, and it can be adopted by different tools.


Slides of My Presentation on Code Bubbles

Following our weekly meetings to discuss the papers published in ICSE 2010, today it was my turn again to present another paper entitled Code Bubbles - Rethinking the User Interface Paradigm of Integrated Development Environments. My first presentation was on Software Traceability with Topic Modeling.

Code Bubbles is the name of a prototype IDE designed to change the User Interface design and user experience of Integrated Development Environments. It uses a bubble metaphor to provide a concurrent view of multiple code fragments, with the goal of improving the productivity, user experience, and attractiveness of IDEs for developers.

This IDE is built using Windows Presentation Foundation (WPF) as the front-end User Interface framework and Eclipse as the back-end tool, and the front end and back end communicate through callbacks, so Code Bubbles builds on the powerful and extensible Eclipse platform.

The Bubbles IDE relies on the bubble metaphor to replace the traditional window/tab-based user interface of IDEs (e.g. Visual Studio, Eclipse, or Xcode), which doesn’t allow more than a single view of a code fragment. With the new approach, you can put multiple bubbles on a big canvas, and they can show different files or different portions of a single file at the same time, to be viewed concurrently by the user.

The authors of the paper have done a set of quantitative and qualitative analyses of the effectiveness of the IDE. Using a quantitative analysis based on 3 projects, 3 test cases, and 3 metrics, they show that in most cases the Code Bubbles IDE performs better than the Eclipse IDE, meaning that it lets the developer view more code at the same time with fewer User Interface operations. In a qualitative analysis with 23 developers with an average of 10 years of experience in industrial Java programming, they got very positive feedback on the relevance of the IDE.

In my opinion this IDE can perform much better than traditional IDEs, especially for reviewing and debugging code, but it may not be very advantageous for other scenarios, especially for writing a new file. Besides, the Code Bubbles IDE requires higher-end hardware, including multi-core CPUs, graphics cards, and bigger monitors, which is basically caused by the use of WPF technology. This is not a very big problem nowadays, though, because such hardware is becoming common.

Because Code Bubbles is designed for scientific experiments, it’s not available for public download, but you can join the Beta program to get your hands on it and help the research team as well. If you’re interested in seeing the user experience that you get with the Bubbles IDE, you can view this video of a common scenario performed on it.

For me, the presentation was more important than the paper itself, because I had spent much time in the past few months learning good presentation skills to revolutionize my presentations, and this was the first presentation where I applied my new skills and experience to deliver a different kind of talk. It won’t be easy to get much out of my slides, since they are designed to complement my talk, but you can read the paper to get an idea of its contributions. Of course, I could have enhanced my slides with images, but I didn’t want to spend money on buying images for this small presentation.

I have uploaded my slides here so you can download and view them.


Nayyeri.NET Turns Five

Five years is not a short while that you can ignore, and some old readers may be surprised to hear that my blog is turning five today! I started blogging here on 28 June 2005 and have been doing so for five years. It may be interesting to go back and read my anniversary posts for 2006, 2007, 2008, and 2009. At least, they were very interesting to me!

Such an age is a sign of maturity for a blog, and I think I can safely assert that my blog has matured in the past five years. There have been many changes, both in my life and on this blog, that have made it very different from the early days in look and feel as well as in content.

To give a quick overview of the past five years: in the first year I focused on republishing news and technical material, and in the second year I started to write technical posts, articles, and tutorials about the current trends in the .NET community. Starting in the third year, I wrote about new technologies that were mostly in the early stages of development, and my content became one of the most influential resources about those technologies. Since then I have also had a leading role in the community.

The past year was very different in several ways. First, I had the lowest number of blog posts, for a few reasons that I explain later. Second, I moved to the United States and started my Ph.D. program in Computer Science, with new concerns. Third, I engaged more in technical talks and presentations, which ate into my blogging time. Fourth, I recently resigned from the Microsoft community to spend my time and effort on the greater good.

All these factors affected this blog and changed it a lot. My posts were mostly about more advanced topics, and I started to publish content about academic and non-Microsoft technologies. For the first time I wrote about advanced Software Engineering concepts, iPhone programming, and Scala, which widened the audience of this blog to some extent.

Unlike the past few years, I only have a few major hits for this year:

  • Migrated to Behistun with a fresh new theme (September 19, 2009): After over four years and using two Telligent products, Community Server and Graffiti CMS, I finally made the move by writing my own blog engine with ASP.NET MVC and reloaded my blog with a fresh and simple template.
  • Left Iran and came to the US (December 11, 2009): In early December 2009 I left Iran to come here to the United States to pursue my Doctorate degree in Computer Science. Obviously, this was a big change that affected my blog as well.
  • Resigned from the .NET community (May 17, 2010): For several reasons that I outlined at the time, I made up my mind to leave the .NET community and find my future in better places, because I couldn’t see any good future for that community. What I’ve witnessed in this short while has proven me right. This was an important change because a big portion of my past blog posts were about Microsoft technologies.

As for the reasons for my lower blogging activity, I’d say that my interests have changed over time and I don’t have such a burning passion for blogging anymore. I have great memories of blogging here in the past, but blogging doesn’t seem very exciting anymore. Besides, Twitter has replaced blogging in many cases and I spend more time publishing quick content over there. Additionally, I’d love to publish more noble content about more advanced topics and develop those ideas, so I prefer to focus on them even if that means only a few posts are published per month.

All in all, I’d expect more or less the same level of blogging activity over the next year as well. Many of the blogs that were alive back in 2005 are either dead or have the same level of activity as mine. It’s a great feeling to have been able to maintain a blog through five sensitive years of my life under very difficult circumstances, from my undergraduate studies to my military service, and now my doctoral studies.

In the end, I have to thank all my old and new readers for following this blog and helping me with their feedback.


Model Checking Lots of Systems

Following our weekly meetings to present and discuss ICSE 2010 papers on Software Engineering, a few weeks ago we talked about another paper entitled Model Checking Lots of Systems - Efficient Verification of Temporal Properties in Software Product Lines.

In industrial Software Development you may face the mass production of different variants of a software system, each customized for an end user’s needs. From one standpoint, a software system is a set of major and minor features, and depending on the circumstances, a user may or may not want some of them. For example, the core functionality and features of Adobe Photoshop are the same (and constant), but you see different editions customized for specific users’ requirements.

Software systems in industry are developed in families, and the members of a family differ in their features. The number of systems developed in this way is growing, and modeling and verifying such systems have become vital tasks. Obviously, it’s arduous to model and verify such systems by hand because, as the number of features increases, the number of possible combinations grows exponentially.
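A short sketch makes the growth concrete. Assuming every feature is optional and independent of the others (real product lines usually add constraints that rule some combinations out), n features yield 2^n possible products. The feature names and the `all_products` function below are hypothetical.

```python
from itertools import product


def all_products(features):
    """Enumerate every on/off combination of independent optional features."""
    return [
        {name for name, enabled in zip(features, choice) if enabled}
        for choice in product([False, True], repeat=len(features))
    ]


# Hypothetical optional features of some product family.
features = ["export", "reporting", "plugins"]
variants = all_products(features)

# Each extra optional feature doubles the number of variants to verify.
growth = [len(all_products(features[:n])) for n in range(len(features) + 1)]
```

With only 20 such features there would already be more than a million variants, which is why checking each product separately does not scale.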

A Software Product Line (SPL) is a set of software systems that share a common set of features, and Software Product Line Engineering (SPLE) is the discipline of reusing those features to make the mass production of such software economical.

In this paper a group of European authors introduce a new approach for model checking systems that are mass-produced in industry and evaluate their approach with experimental results. First they provide a simple motivating example about a beverage vending machine that, at a very high level, takes money, returns change, and serves soda. They then develop the example to show other possibilities for the system, such as serving tea, cancelling a purchase, and distributing soda for free.

After describing the problem that the paper solves, the authors discuss the challenges of their work and the contributions that they make, then move on to the basic concepts and definitions that they use throughout the paper, such as Feature Diagram (FD), Transition System (TS), Finite Automaton (FA), and Featured Transition System (FTS). Essentially, they model a system as a tuple of sets that represent its main components. This lets them use mathematical notation to analyze the behavior of the system.
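To give a rough feel for a Featured Transition System without the mathematics, the sketch below tags each transition of a toy vending machine with the feature that enables it and then projects the model onto one product. This is a simplification under my own assumptions, not the paper's formal definition: the state names, the single-feature tags, and the `project` and `reachable` helpers are all hypothetical.

```python
from typing import NamedTuple, Optional


class Transition(NamedTuple):
    source: str
    action: str
    target: str
    feature: Optional[str]  # None means the transition exists in every product


# Hypothetical featured model of the vending machine example.
FTS = [
    Transition("idle", "pay", "paid", None),
    Transition("paid", "change", "ready", None),
    Transition("ready", "soda", "serving", "soda"),
    Transition("ready", "tea", "serving", "tea"),
    Transition("ready", "cancel", "idle", "cancel"),
    Transition("idle", "free", "ready", "free"),
    Transition("serving", "take", "idle", None),
]


def project(fts, selected):
    """Keep only the transitions that exist in the product `selected`."""
    return [t for t in fts if t.feature is None or t.feature in selected]


def reachable(transitions, start):
    """Return all states reachable from `start` with a simple graph search."""
    seen = {start}
    frontier = [start]
    while frontier:
        state = frontier.pop()
        for t in transitions:
            if t.source == state and t.target not in seen:
                seen.add(t.target)
                frontier.append(t.target)
    return seen


soda_only = project(FTS, {"soda"})
```

Checking each projected product separately is the brute-force route; the appeal of a single featured model is that a checker can explore the shared transitions once instead of once per product.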

I’ll skip the mathematical discussion, as it’s beyond the scope of this blog and hard to explain to my regular readers, but the outcome of this modeling (which you can read in the original paper) is a set of techniques and algorithms for checking such systems automatically.

In the end, this foundation is used to implement their method and evaluate it empirically. The FTS model checking technique is implemented in Haskell, a functional programming language, and is used to check a mine pump controller exemplar; the results of this experiment are provided to show the effectiveness of the technique.

Based on the discussion that we had, this technique is in early development and has a long way to go before it matures. This paper (like many other scientific papers) lacks a thorough experimental analysis, and the only example provided doesn’t convince the reader of the power and effectiveness of the approach.
