Wednesday, April 05, 2006

It's been a long time since I've posted something "chewable". My energies were all directed at the soon-to-be-released Xceed Zip for .NET version 3.0. Though the apparent changes to the public interface are quite minor, and one can teach the new classes quite easily, the underlying code wasn't trivial.

In short, for all of you who know what ZipArchive, ZippedFile and ZippedFolder are, say hello to TarArchive, TarredFile, TarredFolder, GZipArchive and GZippedFile.

And when I say "easy to teach", what it really means is "find yourself a zipping example, and replace class name occurrences of Zip{something} with Tar{something} or GZip{something}".

Sure, there are some gotchas, like the fact that GZIP archives cannot contain filenames with subfolders, are not well suited to contain more than one compressed file, and can contain files without filenames. But these are details you'll get used to quite easily.

There are two things that make me really proud in that product. One is under the hood, and the other is a sample. First, the engine: My colleague Jacques and I have come up with what we call the "Storage Engine". It's an abstraction of what an archiving library needs in terms of temp storage, in-place archive updating, and transactional operations on an archive. Both the new TAR and GZIP implementations use it. In short, it abstracts the fact that we always want to update an archive in place when possible, but fall back to temp files otherwise, and make sure to commit those temp files to any existing archive upon the last modification of it. If things go well, the ZIP implementation will benefit from it sooner rather than later.

Second, the sample: The FTP Sample Explorer is gone, replaced with the FileSystem Snippet Explorer, a sample that lets you see, modify and run code snippets that show you the various tasks one might wish to implement. It goes straight to the point. No bells and whistles, no gravy, just the meat. The code is embedded in the executable as compressed serialized XML data. The main information (each topic's description and code) is nothing more than rich text. The nice thing about this sample is that in order for me to modify and add new topics, I simply need to compile the project with an extra define, and I'm now running the application in "admin" mode, enabling me to update the compressed XML file directly, for the next compilation to benefit from this update.

Though I've finished work on this 3.0 version, I already have both hands in the next two releases of Xceed Zip for .NET and Xceed FTP for .NET. The first one will add support for AES encryption, and the second one will offer proxy support.

I didn't have much time to write because all those releases have a tight schedule I can't bust. I'm leaving Xceed in two months. Yup, I've decided it was time for me to move on. Until then, I have agreed to complete the AES implementation, help Jacques kick-start proxy support and train just about everybody here, each inheriting one of the numerous hats I'm wearing. It was a very difficult decision, since I have only friends at Xceed. Though the nine years or so I've spent here were exciting and challenging, I feel it's time for me to try new stuff... by myself. This isn't a divorce. I won't be far from Xceed, and I'll still be available to help them from time to time. As for you, dear customers and readers, rest assured you will stay in good hands. The team behind Xceed Zip and Xceed FTP, both .NET and ActiveX, will remain strong, and even get stronger than it is now.


.NET | FTP | General | Zip

4/5/2006 8:43:13 AM (Eastern Daylight Time, UTC-04:00)  #   
 Friday, March 10, 2006

Last Wednesday, I was invited to the Visual Studio Talk Show, a French podcast about software development. I enjoyed the experience a lot, but I can now understand how intimidating it is the first time you express yourself in such a "live" way. I may have experience teaching in front of a small crowd, or presenting in front of a larger one, but you cannot prepare yourself in the same way when you're a guest, and depend on your host's questions and direction.

For example, everybody agrees that Scott Hanselman's HanselMinutes improved dramatically as early as his second podcast. That's probably why I'm left with mixed emotions. I'd repeat the experience anytime, just to have an opportunity to improve. But this time, I'd be the driver!



3/10/2006 12:07:10 PM (Eastern Standard Time, UTC-05:00)  #   
 Monday, March 06, 2006

Last week, I got into a car accident. I was driving my son to kindergarten. We were lucky: I hit the side of the other car, which was going in the opposite direction, so the impact was more friction than collision. My air bags did not open, which is a sign the impact was not a head-on collision.

My first instinct was to make sure my boy was ok. He was, in top shape, asking what the noise he heard was. I moved the car into a nearby parking lot, and so did the lady in the other car. Everybody seemed ok. It felt good. "It's only metal" I kept saying to myself and the lady, who was very sorry about her mistake. She did not notice the red flashing lights indicating a defective light and requiring a full stop. "I thought it was green" she even said.

Anyway, my boy's fine, I'm fine, the car's at the garage and will be fixed soon (I hope), without my having to spend a single penny (thanks to no-fault insurance), courtesy car included. And to everybody I tell this story, I keep saying "It's only metal. The important thing is my son and I are ok."...

No... I'm lying... It's not only metal. It's a damn bruise on your pride. I should have made sure all was clear before turning left on that red flashing light. I should have been 100% focused instead of being distracted by my son pointing at the damaged light post to our right. I should have steered the wheel full right faster when I saw the other car about to collide. It went so fast, I'm stuck with the feeling I did nothing, that I was a spectator.

Last week has been a difficult week. I've played real cool and calm with my son and wife, making sure they forget quickly. But something's broken... and it's not only metal.



3/6/2006 8:53:24 AM (Eastern Standard Time, UTC-05:00)  #   
 Monday, February 13, 2006

I've seen some strange behavior from my 3-year-old son Clément lately. I fear heredity...


Fun

2/13/2006 11:37:22 AM (Eastern Standard Time, UTC-05:00)  #   
 Thursday, February 09, 2006

Wonder what's the formula behind Xceed's build numbers? Here's the secret recipe:

( Year - 2000 ) * 1000 + ( Month * 50 ) + ( Day )
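
As a quick worked example, a release built on April 5, 2006 would get ( 2006 - 2000 ) * 1000 + ( 4 * 50 ) + 5 = 6205. Here is the same recipe as a small C# helper. The class and method names are made up for this post and are not part of any Xceed product.

```csharp
using System;

// Hypothetical helper: the names are illustrative, only the formula comes from the recipe above.
static class XceedBuildNumber
{
  // ( Year - 2000 ) * 1000 + ( Month * 50 ) + ( Day )
  public static int FromDate( DateTime date )
  {
    return ( ( date.Year - 2000 ) * 1000 ) + ( date.Month * 50 ) + date.Day;
  }
}
```

Multiplying the month by 50 leaves room for up to 31 days without two months overlapping, and a year never reaches past ( 12 * 50 ) + 31 = 631, so the thousands digit always identifies the year.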

Heck, we even made ourselves an Xceed Version Yahoo! Widget!!!

Update: Until I learn how to open the ".widget" extension for downloading in dasBlog, I renamed the file to "Xceed Version.widget.zip". Just rename to "Xceed Version.widget" once downloaded.


Fun

2/9/2006 3:24:57 PM (Eastern Standard Time, UTC-05:00)  #   
 Tuesday, January 31, 2006

Ever since I've been working with the .NET framework, most of my time was spent on the System.IO namespace. I'm not a UI guy, I'm an IO guy! The most important class in that namespace is System.IO.Stream. And since it was well-designed, probably inspired by other successful stream implementations (Delphi comes to mind), it's very easy to expose features using streams.

My favorite use of streams is for pass-through streams. A pass-through stream is a class which derives from System.IO.Stream, but reads from or writes to an inner stream received at creation. It serves as a data modifier or data analyzer. When reading from a pass-through stream, it first reads from its inner stream, then processes the data read (potentially modifying it) and returns this data. When writing to a pass-through stream, it first processes the provided data (again potentially modifying it), then writes it to its inner stream.

Xceed Zip for .NET and Xceed FTP for .NET both use a plethora of pass-through streams. The most popular is Xceed.Compression.CompressedStream, the stream responsible for compressing data before writing it to its inner stream, or decompressing data read from its inner stream. But most others are internal. We've been juggling with the idea of exposing them for a long time, but believe it would only confuse developers to "see" those new namespaces and classes. Another useful thing with internal classes is that we can change their interface without causing breaking changes.

TransientStream

It was a long debate before we decided to go forth with the "transient" keyword. Not only is it used in the TransientStream type name, but also as a property on many of our pass-through streams. What we meant by "transient" is "volatile", or if you prefer more explicit keywords, "does-not-close-its-inner-stream-when-closed". A TransientStream is about the simplest expression of a pass-through stream. All required property and method overrides simply call the inner stream. The only exception is for the Close method, which simply makes sure not to call Close on the inner stream. This is very useful when you need to pass your stream to another routine which closes the stream, while you don't want your stream to get closed.
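
To make the idea concrete, here is a minimal sketch of such a stream. It is not Xceed's internal TransientStream: the NonClosingStream name and every detail below are my own assumptions, and it overrides Dispose( bool ), which is where closing ends up in current versions of the framework, rather than Close.

```csharp
using System;
using System.IO;

// Hypothetical sketch of a "transient" pass-through stream; not Xceed's implementation.
class NonClosingStream : Stream
{
  private readonly Stream m_inner;

  public NonClosingStream( Stream inner )
  {
    if( inner == null )
      throw new ArgumentNullException( "inner" );

    m_inner = inner;
  }

  // Every member simply forwards to the inner stream.
  public override bool CanRead { get { return m_inner.CanRead; } }
  public override bool CanSeek { get { return m_inner.CanSeek; } }
  public override bool CanWrite { get { return m_inner.CanWrite; } }
  public override long Length { get { return m_inner.Length; } }

  public override long Position
  {
    get { return m_inner.Position; }
    set { m_inner.Position = value; }
  }

  public override void Flush() { m_inner.Flush(); }
  public override int Read( byte[] buffer, int offset, int count ) { return m_inner.Read( buffer, offset, count ); }
  public override long Seek( long offset, SeekOrigin origin ) { return m_inner.Seek( offset, origin ); }
  public override void SetLength( long value ) { m_inner.SetLength( value ); }
  public override void Write( byte[] buffer, int offset, int count ) { m_inner.Write( buffer, offset, count ); }

  protected override void Dispose( bool disposing )
  {
    // The one exception: flush what was written, but never close the inner stream.
    if( disposing && m_inner.CanWrite )
      m_inner.Flush();

    base.Dispose( disposing );
  }
}
```

Wrapping a stream this way before handing it to, say, a StreamWriter in a using block lets the writer "close" its stream while the original stays open for the caller.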

ChecksumStream

This stream does not modify the data read from or written to it, but takes the opportunity to calculate either a CRC32 or an Adler32 on that data. When reading, it can also make sure, upon closing, that the calculated checksum matches an expected checksum, and throw an exception otherwise. In this way, we can insert checksum calculation anywhere in a process without interfering or requiring code changes.
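
As an illustration only, here is a read-only sketch that computes an Adler-32 while the data flows through. The Adler32ReadStream name is hypothetical, this is not the internal ChecksumStream, and the verification against an expected checksum at close time is left out to keep the sketch short.

```csharp
using System;
using System.IO;

// Hypothetical sketch; not Xceed's internal ChecksumStream.
class Adler32ReadStream : Stream
{
  private const uint Modulus = 65521; // Largest prime below 65536.

  private readonly Stream m_inner;
  private uint m_a = 1;
  private uint m_b = 0;

  public Adler32ReadStream( Stream inner )
  {
    if( inner == null )
      throw new ArgumentNullException( "inner" );

    m_inner = inner;
  }

  // Checksum of all the bytes read so far.
  public uint Checksum { get { return ( m_b << 16 ) | m_a; } }

  public override int Read( byte[] buffer, int offset, int count )
  {
    int read = m_inner.Read( buffer, offset, count );

    // The data is returned untouched; we only look at it on the way through.
    for( int i = 0; i < read; i++ )
    {
      m_a = ( m_a + buffer[ offset + i ] ) % Modulus;
      m_b = ( m_b + m_a ) % Modulus;
    }

    return read;
  }

  public override bool CanRead { get { return true; } }
  public override bool CanSeek { get { return false; } }
  public override bool CanWrite { get { return false; } }
  public override long Length { get { throw new NotSupportedException(); } }

  public override long Position
  {
    get { throw new NotSupportedException(); }
    set { throw new NotSupportedException(); }
  }

  public override void Flush() { }
  public override long Seek( long offset, SeekOrigin origin ) { throw new NotSupportedException(); }
  public override void SetLength( long value ) { throw new NotSupportedException(); }
  public override void Write( byte[] buffer, int offset, int count ) { throw new NotSupportedException(); }
}
```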

CombinedStream

The deflate compression algorithm has the ability to detect the end of the data when decompressing. The CompressedStream is itself a pass-through stream. When reading from it, it first reads from the inner stream, then decompresses the data. When it reaches the end of the compressed data, the CompressedStream has the ability to return a stream on the remaining data, in case this inner stream contains more data after the compressed block. Why isn't this equivalent to the inner stream you might ask? Let's say the inner stream isn't seekable. The CompressedStream's Read method first reads N bytes from the inner stream, but may have found that the end of the compressed data is after M bytes (M < N). The inner stream is already N-M bytes too far. The CombinedStream receives both a byte array (the unused N-M bytes) and the inner stream as ctor parameters, and will expose those as one contiguous stream. Pretty slick!
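
A rough sketch of that idea, reading only: the class below first serves the leftover bytes, then continues with the inner stream. The PrefixedReadStream name and its constructor are assumptions for this post, not the real CombinedStream.

```csharp
using System;
using System.IO;

// Hypothetical sketch; not Xceed's internal CombinedStream.
class PrefixedReadStream : Stream
{
  private readonly byte[] m_prefix;
  private readonly Stream m_inner;
  private int m_prefixOffset;

  public PrefixedReadStream( byte[] prefix, Stream inner )
  {
    if( prefix == null )
      throw new ArgumentNullException( "prefix" );

    if( inner == null )
      throw new ArgumentNullException( "inner" );

    m_prefix = prefix;
    m_inner = inner;
  }

  public override int Read( byte[] buffer, int offset, int count )
  {
    int remaining = m_prefix.Length - m_prefixOffset;

    // Serve the bytes that were read "too far" before touching the inner stream.
    if( remaining > 0 )
    {
      int copied = Math.Min( count, remaining );
      Buffer.BlockCopy( m_prefix, m_prefixOffset, buffer, offset, copied );
      m_prefixOffset += copied;
      return copied;
    }

    return m_inner.Read( buffer, offset, count );
  }

  public override bool CanRead { get { return true; } }
  public override bool CanSeek { get { return false; } }
  public override bool CanWrite { get { return false; } }
  public override long Length { get { throw new NotSupportedException(); } }

  public override long Position
  {
    get { throw new NotSupportedException(); }
    set { throw new NotSupportedException(); }
  }

  public override void Flush() { }
  public override long Seek( long offset, SeekOrigin origin ) { throw new NotSupportedException(); }
  public override void SetLength( long value ) { throw new NotSupportedException(); }
  public override void Write( byte[] buffer, int offset, int count ) { throw new NotSupportedException(); }
}
```

A single Read call may return fewer bytes than requested when it crosses from the byte array to the inner stream. That is allowed by the Stream contract, and callers are expected to loop.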

HeaderFooterStream

Xceed Streaming Compression for .NET exposes stream-based (as opposed to archive-based) compression formats. Those formats all have one thing in common: they have a header and a footer. Not all of them can depend on the deflate algorithm to automatically detect the end of the stream. That's why they need to make sure to never return the first M bytes and last N bytes from the inner stream, where M is the expected header size and N the expected footer size.

WindowStream

When exposing part of a zip file as a single AbstractFile, we need to make sure we do not read past the boundaries of that file's data in the zip file. The WindowStream exposes a region of its inner stream as a zero-position, N-length stream.
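
Here is what such a window could look like for reading, assuming a seekable inner stream. The RegionStream name and its constructor parameters are hypothetical; the real WindowStream may well work differently.

```csharp
using System;
using System.IO;

// Hypothetical sketch; not Xceed's internal WindowStream. Requires a seekable inner stream.
class RegionStream : Stream
{
  private readonly Stream m_inner;
  private readonly long m_start;
  private readonly long m_length;
  private long m_position;

  public RegionStream( Stream inner, long start, long length )
  {
    if( inner == null )
      throw new ArgumentNullException( "inner" );

    if( ( start < 0 ) || ( length < 0 ) )
      throw new ArgumentOutOfRangeException( "start", "The start and length cannot be negative." );

    m_inner = inner;
    m_start = start;
    m_length = length;
  }

  public override int Read( byte[] buffer, int offset, int count )
  {
    long remaining = m_length - m_position;

    if( remaining <= 0 )
      return 0; // Never read past the end of the window.

    if( count > remaining )
      count = ( int )remaining;

    // Translate the window position into an inner stream position.
    m_inner.Position = m_start + m_position;
    int read = m_inner.Read( buffer, offset, count );
    m_position += read;
    return read;
  }

  public override long Seek( long offset, SeekOrigin origin )
  {
    long target;

    switch( origin )
    {
      case SeekOrigin.Begin: target = offset; break;
      case SeekOrigin.Current: target = m_position + offset; break;
      default: target = m_length + offset; break;
    }

    if( target < 0 )
      throw new IOException( "Cannot seek before the start of the window." );

    m_position = target;
    return m_position;
  }

  public override bool CanRead { get { return true; } }
  public override bool CanSeek { get { return true; } }
  public override bool CanWrite { get { return false; } }
  public override long Length { get { return m_length; } }

  public override long Position
  {
    get { return m_position; }
    set { Seek( value, SeekOrigin.Begin ); }
  }

  public override void Flush() { }
  public override void SetLength( long value ) { throw new NotSupportedException(); }
  public override void Write( byte[] buffer, int offset, int count ) { throw new NotSupportedException(); }
}
```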

ZCryptStream

This pass-through stream automatically encrypts or decrypts the data written or read, using the basic Zip encryption (which is as weak as me in front of a cheesecake). I will be working on AES encryption very soon, and it will most probably be implemented as a pass-through stream too!

NotifyStream

Though pass-through streams can do much of the task, it is often better for the clarity of the code to have processing done by other classes not deriving from System.IO.Stream. The NotifyStream class exposes three events: ReadingFromStream, WritingToStream and ClosingStream. Any other class can subscribe to those events to intervene in the reading or writing process. This old class has existed since the beginning of Xceed Zip for .NET, but it has proven very useful in the current development we are doing for Tar and GZip support within Xceed Zip for .NET.

ForwardSeekableStream

This new class, created for Xceed Zip for .NET 3.0 (Tar and GZip support), can expose a non-seekable stream as a seekable stream when reading, or at least as a stream reporting a Position when writing. When reading, you can call Seek with an offset beyond the current position, and it will simply read from the non-seekable inner stream until well positioned. And for both reading and writing operations, it counts the number of bytes read or written so it can report a position (provided we knew the original position when created).
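
The reading half of that behavior can be sketched as follows. The ForwardSeekReadStream name, the initialPosition parameter and the choice of exceptions are assumptions made for this post, not the actual ForwardSeekableStream.

```csharp
using System;
using System.IO;

// Hypothetical sketch of forward-only seeking over a non-seekable stream; reading side only.
class ForwardSeekReadStream : Stream
{
  private readonly Stream m_inner;
  private long m_position;

  public ForwardSeekReadStream( Stream inner, long initialPosition )
  {
    if( inner == null )
      throw new ArgumentNullException( "inner" );

    m_inner = inner;
    m_position = initialPosition;
  }

  public override int Read( byte[] buffer, int offset, int count )
  {
    int read = m_inner.Read( buffer, offset, count );
    m_position += read; // Count the bytes so a position can be reported.
    return read;
  }

  public override long Seek( long offset, SeekOrigin origin )
  {
    if( origin == SeekOrigin.End )
      throw new NotSupportedException( "The length of the inner stream is unknown." );

    long target = ( origin == SeekOrigin.Begin ) ? offset : m_position + offset;

    if( target < m_position )
      throw new NotSupportedException( "Only forward seeking is supported." );

    // "Seek" by reading and discarding until well positioned.
    byte[] discard = new byte[ 4096 ];

    while( m_position < target )
    {
      int wanted = ( int )Math.Min( discard.Length, target - m_position );

      if( Read( discard, 0, wanted ) == 0 )
        throw new EndOfStreamException();
    }

    return m_position;
  }

  public override bool CanRead { get { return true; } }
  public override bool CanSeek { get { return true; } }
  public override bool CanWrite { get { return false; } }
  public override long Length { get { throw new NotSupportedException(); } }

  public override long Position
  {
    get { return m_position; }
    set { Seek( value, SeekOrigin.Begin ); }
  }

  public override void Flush() { }
  public override void SetLength( long value ) { throw new NotSupportedException(); }
  public override void Write( byte[] buffer, int offset, int count ) { throw new NotSupportedException(); }
}
```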

FtpAsciiDataStream

Xceed FTP for .NET also uses pass-through streams. For example, the FtpAsciiDataStream wraps the NetworkStream to perform conversion of LF to CR/LF on the fly when sending a file in ASCII mode.


.NET | Compression | FileSystem | FTP | Zip

1/31/2006 9:47:29 AM (Eastern Standard Time, UTC-05:00)  #   
 Tuesday, January 17, 2006

Found this from Scott... Hey! It was my idea!


Fun | General

1/17/2006 9:46:11 AM (Eastern Standard Time, UTC-05:00)  #   

One of the lesser-known features of the Xceed FileSystem is its file filtering capabilities. Not only does it come with built-in support for filtering files based on name, size, attributes and dates, it also lets you easily combine criteria. Furthermore, as for all Xceed components, it's fully extensible.

For example, let's say I want to copy files matching the "*.txt" filter that have the archive attribute on. The following code can be used:

  sourceFolder.CopyFilesTo( destFolder, true, true, "*.txt", FileAttributes.Archive );

What is happening beneath the surface? The fourth parameter is "params object[] filters". This means you can provide any number of parameters of any type. Any type? Not exactly. What you should see is "params Filter[] filters". The Filter class is the base class for any type of filter you could think of. The Xceed FileSystem comes with seven built-in filter classes, divided into two categories:

Operators: AndFilter, OrFilter, NotFilter.
Filters: NameFilter, AttributeFilter, SizeFilter, DateTimeFilter.

So the line of code above can be seen as this:

  sourceFolder.CopyFilesTo( destFolder, true, true,
    new AndFilter( new NameFilter( "*.txt" ), new AttributeFilter( FileAttributes.Archive ) ) );

But we've decided that forcing the creation of a new NameFilter every time you want to filter on a mask was overkill for such a common operation. That's why we also accept two other types of parameters. Strings are automatically converted to a NameFilter, and FileAttributes are automatically converted to an AttributeFilter. Finally, providing two or more filters as separate parameters automatically puts them in an AndFilter.

But then, what happens to another common scenario: filtering files based on two name filters? Passing "*.txt" as the fourth parameter, and "*.doc" as the fifth would generate an AndFilter around them, thus only matching files that match both the ".txt" and the ".doc" extensions... Oops!

We support yet another exception: any string filter can contain a pipe character (|) for providing multiple name filters that will be grouped in an OrFilter, like this:

  sourceFolder.CopyFilesTo( destFolder, true, true, "*.txt|*.doc" );

This will automatically be translated to:

  sourceFolder.CopyFilesTo( destFolder, true, true, 
    new OrFilter( new NameFilter( "*.txt" ), new NameFilter( "*.doc" ) ) );

By the way, most operator-like filters' constructors will accept strings and FileAttributes too, doing the translation to NameFilter and AttributeFilter instances for you.
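
To see why the pipe matters, here is a small stand-alone sketch of those conversion rules. It does not use the Xceed classes at all: plain delegates over a file name stand in for Filter instances, the FilterShorthandSketch name is made up, and the wildcard matching through a regular expression is only an approximation of what a NameFilter does.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Stand-in for illustration only: a Func<string, bool> plays the role of a Filter.
static class FilterShorthandSketch
{
  // One mask, the equivalent of a single case-insensitive NameFilter.
  public static Func<string, bool> Name( string mask )
  {
    string pattern = "^" + Regex.Escape( mask ).Replace( "\\*", ".*" ).Replace( "\\?", "." ) + "$";
    Regex regex = new Regex( pattern, RegexOptions.IgnoreCase );

    return fileName => regex.IsMatch( fileName );
  }

  // One string parameter: masks separated by a pipe are OR-ed together.
  public static Func<string, bool> FromString( string shorthand )
  {
    List<Func<string, bool>> alternatives = new List<Func<string, bool>>();

    foreach( string mask in shorthand.Split( '|' ) )
      alternatives.Add( Name( mask ) );

    return fileName => alternatives.Exists( filter => filter( fileName ) );
  }

  // Several parameters: each one must match, as with the implicit AndFilter.
  public static Func<string, bool> Combine( params string[] shorthands )
  {
    List<Func<string, bool>> all = new List<Func<string, bool>>();

    foreach( string shorthand in shorthands )
      all.Add( FromString( shorthand ) );

    return fileName => all.TrueForAll( filter => filter( fileName ) );
  }
}
```

With this sketch, Combine( "*.txt", "*.doc" ) rejects "report.doc" because no name satisfies both masks, while Combine( "*.txt|*.doc" ) accepts it.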

The final "hidden" feature relates to case sensitivity. By default, the FileSystem is case-insensitive, as is the Windows platform. But since archives like zip files may come from other planets like Linux or Mac OS X, we wanted to support case-sensitive file matching. If you prepend your string with the "greater than" character (>), the resulting NameFilter will be case-sensitive. The following code will only match files which have their extension in upper-case:

  sourceFolder.CopyFilesTo( destFolder, true, true, ">*.TXT" );

Since Windows does remember the casing of filenames, this can be very useful even on the Windows platform. Furthermore, since we released the library, the Mono project came to life, and our library can now be used on other platforms.

Extending filters

You can easily create custom filters by deriving from the Xceed.FileSystem.Filter class and overriding the IsItemMatching method. A SearchFilter class, which searches for a particular text within files, could look like this:

  class SearchFilter : Filter
  {
    public SearchFilter( string text )
      : base( FilterScope.File )
    {
      if( text == null )
        throw new ArgumentNullException( "text" );

      if( text.Length == 0 )
        throw new ArgumentException( "The text cannot be empty.", "text" );

      m_text = text;
    }

    public override bool IsItemMatching( FileSystemItem item )
    {
      AbstractFile file = item as AbstractFile;

      if( file == null )
        return false;

      try
      {
        int bufferSize = ( file.Size < 0x1000000 )
          ? unchecked( ( int )file.Size )
          : 0x1000000;

        byte[] search = System.Text.Encoding.Default.GetBytes( m_text );

        if( search.Length <= bufferSize )
        {
          byte[] buffer = new byte[ bufferSize ];
          int found = 0;

          using( BinaryReader reader = new BinaryReader( file.OpenRead( FileShare.ReadWrite ) ) )
          {
            int read = 0;

            while( ( read = reader.Read( buffer, 0, bufferSize ) ) > 0 )
            {
              found = FindBuffer( buffer, 0, read, search, found );

              if( found == search.Length )
                return true;
            }
          }
        }     
      }
      catch {}

      return false;
    }

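    private int FindBuffer(
      byte[] source,
      int sourceStart,
      int sourceCount,
      byte[] search,
      int searchIndex )
    {
      // TODO: Param check!

      // Knuth-Morris-Pratt failure table: failure[ k ] is the length of the longest
      // proper prefix of search[ 0..k ] that is also a suffix of it. Simply resetting
      // to zero on a mismatch would miss matches such as "ab" inside "aab".
      int[] failure = new int[ search.Length ];

      for( int k = 1, length = 0; k < search.Length; k++ )
      {
        while( ( length > 0 ) && ( search[ k ] != search[ length ] ) )
          length = failure[ length - 1 ];

        if( search[ k ] == search[ length ] )
          length++;

        failure[ k ] = length;
      }

      for( int i=0; i<sourceCount; i++ )
      {
        byte current = source[ sourceStart + i ];

        // On a mismatch, fall back to the longest prefix that can still match.
        while( ( searchIndex > 0 ) && ( current != search[ searchIndex ] ) )
          searchIndex = failure[ searchIndex - 1 ];

        if( ( current == search[ searchIndex ] ) && ( ++searchIndex == search.Length ) )
          return searchIndex;
      }

      // Length of the partial match, carried over to the next buffer.
      return searchIndex;
    }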

    private string m_text; // = null
  }

Using this custom filter, you can now copy only files that contain a particular text:

  sourceFolder.CopyFilesTo( destFolder, true, true, new SearchFilter( "allo" ) );

Conditionally recursing

One missing feature we had with the filtering will be addressed with today's release. There was no way to control which subfolders to recurse into when calling methods accepting filters (CopyFilesTo, MoveFilesTo, GetFiles, GetFolders). The FilterScope.Folder value wasn't preventing recursing into subfolders. It was only meant to include or exclude folder entries from being copied. And passing "true" or "false" as the "recurse" parameter was an all-or-nothing deal.

Today, we introduce a new scope: FilterScope.Recurse. It does not interfere with the File or Folder scope, and only determines if we should continue matching files in each subfolder. Its number one use is for excluding subfolders:

  sourceFolder.CopyFilesTo( destFolder, true, true, 
    "*.txt", new NotFilter( new NameFilter( "Bar", FilterScope.Recurse ) ) );

The way you combine "Recurse" filters and other filters is irrelevant. When deciding to copy files or folders, the library ignores any filters with the Recurse scope. When deciding to call itself recursively, the library ignores any filters with the File or Folder scope.
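
The separation between the two decisions can be pictured with a plain System.IO walk. This is only a sketch of the principle, not how the library is implemented: the RecurseScopeSketch name is made up, and two delegates stand in for the filters with the File scope and the filters with the Recurse scope.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical sketch: two independent decisions, one for files and one for recursion.
static class RecurseScopeSketch
{
  public static void Collect(
    string folder,
    Func<string, bool> fileFilter,
    Func<string, bool> recurseFilter,
    List<string> results )
  {
    // Deciding which files to keep: only the "file" filter is consulted.
    foreach( string file in Directory.GetFiles( folder ) )
    {
      if( fileFilter( Path.GetFileName( file ) ) )
        results.Add( file );
    }

    // Deciding where to recurse: only the "recurse" filter is consulted.
    foreach( string subfolder in Directory.GetDirectories( folder ) )
    {
      if( recurseFilter( Path.GetFileName( subfolder ) ) )
        Collect( subfolder, fileFilter, recurseFilter, results );
    }
  }
}
```

Calling Collect with a file filter accepting "*.txt" names and a recurse filter rejecting "Bar" mirrors the CopyFilesTo call above: text files are gathered everywhere except under any folder named Bar.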

We plan on providing new types of filters. Suggestions welcome!



1/17/2006 9:35:07 AM (Eastern Standard Time, UTC-05:00)  #   
 Wednesday, January 11, 2006

First, I wish all my readers health and happiness for the new year.

Now, let's jump into the subject of the day: Scott Hanselman's HanselMinutes. I'm currently listening to his first podcast. I've never been a real fan of podcasts, but since Scott Hanselman is about my number 1 blogger, I could not miss this event.

Hmmm, how can I express my feelings about podcasting without hurting Scott's feelings? Is it me, or are computer subjects not fit for audio? I want links! I want screenshots! I want examples! I want immediate access to extended information when I need it! With a podcast, I'm stuck listening to the whole stream. Sure, I can fast forward, but you end up playing the "find that show you recorded" game you play with your VCR. Worse, you don't know what you're looking for. You are at the mercy of the podcaster. You can't filter, you can't opt in or out of a subject.

Maybe I'm not listening to podcasts at the proper moment? Maybe I'm trying to use podcasts as if they were audio blogs, which they are not? I tried listening to a podcast in my car on the way to work, just to discover I was sad to miss the local news and forecasts I usually listen to in the morning. I tried listening to a podcast at home in the evening when I push my computer geekness to its limits by moving back to a computer, but I generally need to disconnect from work, and I prefer playing Guild Wars! I tried listening to a podcast in bed before getting to sleep, just to find out I prefer doing other things in bed... like sleeping... and... ok, you get the picture!

The funny part is that I've been approached by the Visual Studio Talk Show for a 45-minute podcast-style interview in French, and I've said yes. But it's only a one-time deal. Even though this show is mostly accessible as a podcast, I see this as an interview, and there is no way I could maintain a weekly podcast.

So I'll conclude with Scott's own words: podcasting sucks. It wastes my precious time. I would have liked it very much if Carl Franklin had asked Scott about his background, his developer path, about himself. I want to know more about Scott. For links, I'll continue reading his blog.

Oh, and one more thing: the damn advertising is barely tolerable.



1/11/2006 9:09:26 AM (Eastern Standard Time, UTC-05:00)  #