Huge performance boost for EGit sync-view

For the last few days I have been working on EGit Synchronize View performance, especially on the Workspace presentation model (the Git Change Set is next on my list ;)). My starting point was 1m 40s to compare two Linux kernel versions, v2.6.36 versus v2.6.38-rc2. Such a result seems to be very good, but you need to know that it was achieved on an SSD drive; on a regular HDD it would be much worse (maybe more than 3 or 4 minutes).

What can be improved? First of all, whenever the current implementation is asked for the members of a particular folder, it opens the repository* and reads data directly from it. Of course there is a caching mechanism, but it works only at the level of a single resource. Therefore, when you launch synchronization on a repository that has about 300 folders, the current implementation will create and configure 300 connections* to the repository to read data, and only then cache it.

So my idea was to create only one connection*, read all the data at once and put it into a global cache. This cache is then used whenever the list of members of a given folder is required. This approach gives about a 2.5x performance boost to synchronization (from 1m 40s down to 40s). This result looks much better, and maybe on an HDD this action will take less than 2 minutes … but this isn’t over 😉
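The "read everything once, answer every folder from memory" idea can be sketched roughly as below. This is not the EGit code: the class name `FolderMemberCache`, its methods, and the use of plain path strings instead of JGit walks are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: one pass over all paths fills a folder -> members cache. */
final class FolderMemberCache {
    // Key: repository-relative folder path ("" is the root); value: its direct members.
    private final Map<String, Set<String>> membersByFolder = new HashMap<>();

    /** Builds the whole cache in a single pass over paths such as "a/b/c.txt". */
    static FolderMemberCache build(List<String> paths) {
        FolderMemberCache cache = new FolderMemberCache();
        for (String path : paths) {
            cache.register(path);
        }
        return cache;
    }

    // Registers the path under its parent folder, then each ancestor under its own parent.
    private void register(String path) {
        String child = path;
        while (true) {
            int slash = child.lastIndexOf('/');
            String parent = slash < 0 ? "" : child.substring(0, slash);
            Set<String> members =
                    membersByFolder.computeIfAbsent(parent, k -> new LinkedHashSet<>());
            if (!members.add(child) || slash < 0) {
                return; // already known (so its ancestors are too), or the root was reached
            }
            child = parent;
        }
    }

    /** Answers a "members of folder" request from memory, without touching the repository. */
    List<String> members(String folder) {
        Set<String> members = membersByFolder.get(folder);
        return members == null ? new ArrayList<>() : new ArrayList<>(members);
    }
}
```

With a cache like this, the 300 per-folder repository reads described above collapse into one `build` call followed by map lookups.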

Reading the members of a folder is one thing, but getting information about a particular file (whether it was changed, added or removed, and whether the change is incoming, outgoing or conflicting) is another. Currently we are reusing the default implementation of the SyncInfo class from the Team Framework. This is a really good implementation … when you cannot obtain such information from the version control system. In Git we have a SHA-1 for each file and folder version, so we don’t have to compare file contents to check whether they are the same; comparing SHA-1s is sufficient. This should save lots of CPU time, disk I/O and developer time spent waiting for synchronization to finish ;).

Now that I already had a cache containing the list of all changed resources, it was natural to add information about the change type to it. Then, whenever the Team Framework needs to know the change type, it can be easily obtained from this cache … no I/O is needed, no comparison, just a read from the in-memory cache that returns the proper value.
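Here is a minimal sketch of how a change type could be derived from SHA-1s alone and then served from memory. It assumes a three-way comparison (common ancestor, local version, remote version) with ids held as hex strings and `null` meaning "the file does not exist in that version". The class `ChangeTypeCache`, its enums and the classification rules are illustrative assumptions, not the actual EGit or Team Framework code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical sketch: classify each path once by comparing object ids, then cache it. */
final class ChangeTypeCache {
    enum Direction { IN_SYNC, INCOMING, OUTGOING, CONFLICTING }
    enum Kind { NONE, ADDITION, DELETION, CHANGE }

    static final class Change {
        final Direction direction;
        final Kind kind;

        Change(Direction direction, Kind kind) {
            this.direction = direction;
            this.kind = kind;
        }
    }

    private final Map<String, Change> changes = new HashMap<>();

    /** Ids are hex SHA-1 strings; null means the path is absent in that version. */
    void put(String path, String baseId, String localId, String remoteId) {
        changes.put(path, classify(baseId, localId, remoteId));
    }

    /** Pure in-memory lookup: no file contents are read or compared here. */
    Change get(String path) {
        return changes.get(path);
    }

    static Change classify(String base, String local, String remote) {
        boolean localChanged = !Objects.equals(base, local);
        boolean remoteChanged = !Objects.equals(base, remote);
        if (!localChanged && !remoteChanged) {
            return new Change(Direction.IN_SYNC, Kind.NONE);
        }
        if (localChanged && remoteChanged) {
            if (Objects.equals(local, remote)) {
                // Both sides ended up with identical content, so nothing to synchronize.
                return new Change(Direction.IN_SYNC, Kind.NONE);
            }
            return new Change(Direction.CONFLICTING, kindOf(base, local));
        }
        return localChanged
                ? new Change(Direction.OUTGOING, kindOf(base, local))
                : new Change(Direction.INCOMING, kindOf(base, remote));
    }

    private static Kind kindOf(String base, String changed) {
        if (base == null) {
            return Kind.ADDITION;
        }
        return changed == null ? Kind.DELETION : Kind.CHANGE;
    }
}
```

The point of the design is that `classify` only compares ids, so it is cheap enough to run for every path while the cache is being filled.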

I’m sure that you are wondering how fast synchronization can be now … I can only say that it is REALLY fast … as you may remember, my starting point was 1m 40s; now the same comparison finishes in less than 7s!! This means that synchronization is now about 14 times faster than before! What does this mean for a regular user? Well, it means that you will get the results of the ‘Synchronize Workspace’ action almost instantly.

Unfortunately, the changes mentioned above are still awaiting review in Gerrit; you can grab them from change #3891 and build them locally. I hope this will be included in the 1.1 release …

* JGit uses the concept of walks (with filters) through the repository, but I’ve used more commonly recognized terminology here

Java7 Launch Party @Szczecin

Java7 T-Shirt front

Java7 T-Shirt back

Szczecin Java User Group is organizing a Launch Party event for JDK7. There will be a short introduction to the new features in the Java language and APIs, presented by Filip ‘Filus’ Pająk. Filip will also be speaking about Java7 syntax support in popular IDEs like IntelliJ IDEA and Eclipse IDE.

As usual at our meetings, there will be a drawing for JRebel and IntelliJ IDEA licenses. This time we also have cool Java7 T-Shirts for all attendees.

Registration isn’t required, but it would be nice if you joined this event on Facebook (there you can also find more details about this event).

Eclipse DemoCamp Poznan 2011

Same as a year ago, I’ll be presenting some new features in Eclipse at DemoCamp Poznan.

This time I’ll mainly be speaking about code review and how this process can be handled from Eclipse. Everything will be based on Gerrit Code Review; I’ll also show the Gerrit Jenkins/Hudson integration provided by Gerrit Trigger. Another project I’ll use is mylyn-reviews, and last but not least will be (of course) EGit. All those projects will be mixed together to give a quick overview of Code Review 2.0.

If you are interested, please “register” on the event wiki page.

See you in Poznan!

Here you can find my presentation slides.

EGit Synchronize View Workflow Updates

One part of my Google Summer of Code project is to improve the EGit synchronization workflow and make it easier for new users to understand how it works.

So almost two weeks ago I wrote a post with a proposed new workflow for the Synchronize Wizard. The main idea of that post was to start a discussion with the community about how they want to use this wizard. According to Google Analytics this post was displayed around 120 times, which isn’t that bad I think … but only one person left a comment on it, which isn’t a good result. Maybe I’m doing something wrong or I’m not making myself clear enough. Or maybe this topic isn’t so important for other people … I don’t know, maybe you can help me and give me some hints?

Apart from this, here is another set of changes proposed for the Synchronize workflow. Most of them are waiting for comments and approval in Gerrit. This is a good moment to comment and share thoughts about the current implementation, before it is merged into the master branch. Additionally, I’ve opened two bugs for discussion of the Synchronize Wizard and Team menu based workflows:

  • 344891 – for Team context menu
  • 344888 – for sync-wizard

I understand that you don’t want to play with our code base and struggle with project setup only to check one or two new features, so here is a short description of my ideas and some screenshots:

  1. Always use the currently selected branch (HEAD) as the source of synchronization
    As you may know, the synchronization dialog (the dialog that pops up after you select Team > Synchronize… from a project’s context menu) currently allows you to select a source and a destination branch, and after that you can launch the synchronization action.

    When I was implementing this feature I had the git diff command in mind, where you can easily compare two given points in the repository history. But this command produces patch-like output. You cannot move changes around; the only thing you can do with it is review it. In the Synchronize View, however, we tend to move changes around and prepare a commit. So it is more than a simple git diff.

    Another huge issue with this approach is that handling Synchronize View context menu actions like ‘Commit’, ‘Merge’ and ‘Overwrite and Update’ gets really complicated when the base branch isn’t the actual working branch.

    Because of that I’ve decided that in the new workflow you need to select only a destination branch! For now I don’t plan to remove this functionality from the EGit code, because maybe in the future we’ll find a use case where comparing two given branches without switching to one of them could be useful.

  2. Always fetch changes before synchronization
    I’ve noticed that new git users who come from CVS or SVN don’t really get the idea of ‘fetching changes locally’. They launch synchronization and want to see incoming changes without fetching them into the repository. This feature could also be useful for experienced git users, because it saves a couple of mouse clicks ;)

    The implementation that I’ve proposed doesn’t launch the fetch action every time. It checks whether the current branch tracks a remote branch; if it does, it fetches changes from that remote, otherwise it does nothing.

    OK, but what if I’m offline and working on a branch that tracks origin/master? Would I be forced to wait until a connection timeout occurs?

    Well … it depends 😉 Every time a fetch fails, a dialog will inform you that you can disable automatic fetch in the Team > Git preferences. So in the worst case you will wait for a connection timeout only once, and then you will be informed that you can disable auto-fetch.

  3. New Synchronize Wizard
    After the discussion in bug 344888 I’ve decided to abandon the changes that I presented two weeks ago and implement it this way instead:

    EGit Synchronize Wizard

    As you can see, there is only a destination branch. Also, all project names have decorations that show the repository name and the current branch name. There is a single button for including local changes and that’s all; no more pages or extra logic.

    Additionally, the destination branch list will also contain additional refs like FETCH_HEAD.

  4. New context menu option ‘Synchronize Workspace’

    EGit Synchronize Workspace

    This is a shortcut for comparing HEAD against HEAD with locally made changes included. It is useful when you want to see what changes you made in the workspace before you commit them. I think this is the most common use case for the Synchronize View, therefore I’ve decided to add this shortcut.

  5. No more Synchronize-dialog
    This is Matthias Sohn’s idea: replace the synchronize dialog with a dynamically built sub menu entry (same as in the ‘Switch To’ action).

    EGit Synchronize... Sub Menu

    This sub menu will contain at most 20 elements. They are alphabetically ordered and include local and remote branches, tags and additional refs like FETCH_HEAD. You won’t find HEAD or the name of the currently selected branch here, because the ‘Synchronize Workspace’ action is for synchronizing against those. If you don’t see the branch or tag that you would like to synchronize against, you need to choose the ‘Custom…’ option. The Synchronize Wizard will then be shown, where you can choose the destination from the complete list of branches, tags and additional refs.

  6. Push and Pull actions in Synchronize View toolbar
     

    EGit SynchronizeView Push and Pull Toolbar Actions

    This change is merged into current master, so if you want to try it out you need to install a nightly build of EGit.

    There is only one limitation for the push action: it works only when you are synchronizing a single repository. It is disabled when you are synchronizing more than one repo.
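Two of the ideas in the list above, the "fetch only when the branch tracks a remote" rule from item 2 and the dynamically built sub menu from item 5, can be sketched in plain Java as below. Everything here is hypothetical: the class `SyncWorkflowSketch`, the `Fetcher` interface and the returned hint text are made up for illustration. The sketch also assumes that the limit of 20 applies to refs only and that ‘Custom…’ is always appended as an extra entry, which the description above does not spell out.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical sketch of the auto-fetch rule (item 2) and the sub menu contents (item 5). */
final class SyncWorkflowSketch {
    static final int MAX_REFS = 20;
    static final String CUSTOM = "Custom\u2026";

    /** Stand-in for whatever actually performs the fetch. */
    interface Fetcher {
        void fetch(String remote) throws IOException;
    }

    /**
     * Fetches only when auto-fetch is enabled and the branch tracks a remote.
     * Returns null if nothing went wrong, otherwise a hint to show to the user.
     */
    static String fetchBeforeSync(boolean autoFetchEnabled, String trackedRemote, Fetcher fetcher) {
        if (!autoFetchEnabled || trackedRemote == null) {
            return null; // nothing to fetch
        }
        try {
            fetcher.fetch(trackedRemote);
            return null;
        } catch (IOException e) {
            return "Fetch from " + trackedRemote
                    + " failed. Automatic fetch can be disabled in Team > Git preferences.";
        }
    }

    /** Sorted refs without HEAD and the current branch, capped, followed by 'Custom…'. */
    static List<String> buildSubMenu(List<String> refs, String currentBranch) {
        TreeSet<String> sorted = new TreeSet<>(); // natural String order, duplicates dropped
        for (String ref : refs) {
            if (!ref.equals("HEAD") && !ref.equals(currentBranch)) {
                sorted.add(ref);
            }
        }
        List<String> menu = new ArrayList<>();
        for (String ref : sorted) {
            if (menu.size() == MAX_REFS) {
                break;
            }
            menu.add(ref);
        }
        menu.add(CUSTOM);
        return menu;
    }
}
```

Note that natural `String` ordering puts upper-case names such as FETCH_HEAD before lower-case branch names; a real menu might prefer a case-insensitive order.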

That is all for now. Any ideas and comments are welcome (and needed ;))!

New synchronize wizard for EGit

Here are some screenshots of the redesigned steps in the EGit synchronization wizard. The main idea is to make the most common synchronization use cases as short as possible. So I came up with the idea of ‘predefined synchronization configurations’. In the first step of the sync-wizard you can select from three options:

  • Working Tree
  • Remote Tracking
  • Custom

EGit new synchronization wizard - page 1

After selecting the first option you will see only uncommitted local changes (those that are staged/in the index and those that aren’t). This should help when you want to review your changes just before committing or staging them.

The second option, ‘Remote Tracking’, is only available when the current branch tracks one of your remote branches (if it doesn’t, this option is disabled). This option will show you all locally made changes (same as in ‘Working Tree’) but also the changes made in commits that were added to the local and remote branches after you started the local one.

Selecting one of those two options (and selecting the checkbox next to the repository name) will enable the ‘Finish’ button, and you are done with launching synchronization.

If the ‘Custom’ option is selected for at least one repository, the ‘Finish’ button will be disabled, but ‘Next >’ will be enabled. In this situation you must go to the next page to set up your custom synchronization.
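The button rules from the last two paragraphs boil down to a small predicate over the options chosen for the checked repositories. The sketch below is only an illustration of those rules; the class `WizardButtonsSketch`, the `Option` enum and both method names are hypothetical and not taken from the EGit wizard code.

```java
import java.util.Collection;

/** Hypothetical sketch of the 'Finish' / 'Next >' enablement rules on the first wizard page. */
final class WizardButtonsSketch {
    enum Option { WORKING_TREE, REMOTE_TRACKING, CUSTOM }

    /**
     * 'Finish' is enabled when at least one repository is checked and none of the
     * checked repositories uses the 'Custom' option.
     */
    static boolean canFinish(Collection<Option> optionsOfCheckedRepos) {
        return !optionsOfCheckedRepos.isEmpty()
                && !optionsOfCheckedRepos.contains(Option.CUSTOM);
    }

    /** 'Next >' is enabled as soon as one checked repository uses the 'Custom' option. */
    static boolean canGoNext(Collection<Option> optionsOfCheckedRepos) {
        return optionsOfCheckedRepos.contains(Option.CUSTOM);
    }
}
```

For example, two checked repositories set to ‘Working Tree’ and ‘Remote Tracking’ allow ‘Finish’, while switching either of them to ‘Custom’ disables ‘Finish’ and enables ‘Next >’.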

On the ‘custom synchronization step’ you will only see the repositories that were chosen to have a custom synchronization. Here you can select the source and destination branch, and also include or exclude local changes from the synchronization results.

EGit new synchronization wizard - page 2

Those changes are currently pending review in our Gerrit, but I think that they will be merged into the master branch (note that they won’t be included in the 0.12 release).

What do you think about such an approach to git synchronization? Maybe you see some other ‘predefined’ day-to-day synchronization configuration; if so, please let me know!