Thursday, June 16, 2005

Customer requests come in waves, as if fashion were driving the development industry. Lately, many customers have been trying to compress log files. I decided it was time for a little sample.

The idea was to encode each string message in Unicode and compress it into a plain file, one after the other. I could have used a zip file with each file entry representing a message, but for small messages, the zip headers would take up too much space, defeating the purpose of compressing in the first place.

The deflate compression method has one nice feature: it can detect the end of the compressed data while decompressing, without knowing the total compressed size. That's why the CompressedStream class exposes a GetRemainingStream method for retrieving a Stream reference on the rest of the data in the inner stream.

I've kept the sample real simple, so you get the general idea:

using System;
using System.IO;
using Xceed.Compression;
 
namespace CompressedLogExample
{
  public class CompressedLog
  {
    public CompressedLog( string filename )
    {
      if( filename == null )
        throw new ArgumentNullException( "filename" );
 
      if( filename.Length == 0 )
        throw new ArgumentException( "The filename cannot be empty.", "filename" );
 
      Xceed.Compression.Licenser.LicenseKey = "SAMPLE-APPLICATION-KEY";
      m_filename = filename;
    }
 
    public void AddMessage( string message )
    {
      if( message == null )
        throw new ArgumentNullException( "message" );
 
      lock( m_lock )
      {
        using( Stream fileStream = new FileStream( m_filename, FileMode.Append ) )
        {
          using( CompressedStream compStream = new CompressedStream( fileStream ) )
          {
            byte[] encodedMessage = System.Text.Encoding.Unicode.GetBytes(
              DateTime.Now.ToString() + Environment.NewLine + message );
 
            compStream.Write( encodedMessage, 0, encodedMessage.Length );
          }
        }
      }
    }
 
    public void DisplayMessages( TextWriter writer )
    {
      if( writer == null )
        throw new ArgumentNullException( "writer" );
 
      lock( m_lock )
      {
        try
        {
          using( Stream originalStream = new FileStream( m_filename, FileMode.Open ) )
          {
            Stream workStream = originalStream;
 
            do
            {
              using( CompressedStream compStream = new CompressedStream( workStream ) )
              {
                // We don't want compStream to close workStream!
                compStream.Transient = true;
 
                using( StreamReader reader = new StreamReader( 
                         compStream, System.Text.Encoding.Unicode ) )
                {
                  writer.WriteLine( reader.ReadToEnd() );
                  writer.WriteLine( "---" );
 
                  // Before closing the reader (thus compStream), acquire a stream on
                  // the rest of the data if present.
                  workStream = compStream.GetRemainingStream();
 
                  // We must do this AFTER calling GetRemainingStream since compStream
                  // may have read more from its inner stream than necessary.
                  if( workStream.Position == workStream.Length )
                  {
                    workStream = null;
                  }
                }
              }
            }
            while( workStream != null );
          }
        }
        catch( FileNotFoundException )
        {
          writer.WriteLine( "The log is empty." );
        }
      }
    }
 
    private string m_filename = string.Empty;
    private object m_lock = new object();
  }
}

Unfortunately, you can't replace the CompressedStream with a formatted stream like GZipCompressedStream, because formatted streams do not expose a GetRemainingStream method yet. Too bad, since the GZipCompressedStream can store a little information in its header. I'll have to open a feature request about that!

Here is some sample code for using this CompressedLog class:

    static void Main(string[] args)
    {
      CompressedLog log = new CompressedLog( @"d:\temp\log.cmp" );
 
      while( true )
      {
        Console.WriteLine( "Write your next message below (empty message to quit):" );
 
        string line = Console.ReadLine();
 
        // ReadLine returns null when the input stream is closed.
        if( line == null || line.Length == 0 )
          break;
 
        log.AddMessage( line );
      }
 
      Console.WriteLine();
      Console.WriteLine( "Your messages were:" );
 
      log.DisplayMessages( Console.Out );
 
      Console.WriteLine( "Press <Enter> to quit." );
      Console.ReadLine();
    }


6/16/2005 3:04:00 PM (Eastern Daylight Time, UTC-04:00)  #   
 Wednesday, February 02, 2005

I've been working part time (translation: I should be working on something else) on a new sample: my own Command Prompt. I know, I'm reinventing the wheel, not to mention that Microsoft will launch a new one called msh (codename Monad). But it was more of a proof of concept around exposing AbstractFolder and AbstractFile within a command prompt.

E:\>dir

  Directory of E:\

      DATE     TIME     SIZE or TYPE NAME
27/12/2004  4:23 PM         [FOLDER] Backup
03/11/2004 10:33 AM         [FOLDER] Chart30
24/11/2004 10:09 AM         [FOLDER] CLR Profiler
11/01/2005  5:24 PM         [FOLDER] Config.Msi
24/11/2004 10:10 AM         [FOLDER] My Music
12/01/2005  4:13 PM         [FOLDER] My Pictures
10/09/2004  1:52 PM         [FOLDER] RECYCLER
30/09/2004  8:40 PM         [FOLDER] System Volume Information
02/02/2005  2:49 PM         [FOLDER] temp
03/11/2004 10:33 AM         [FOLDER] XceedProjectsNET
25/01/2005  7:06 AM              143 toto.txt

  Files: 1  Folders: 14  Total file size: 143

E:\>copy toto.txt temp
 100%
E:\>cd temp
E:\temp\>

As you can see, I can list the contents of folders, copy files, and change the working folder. The application simply manages a working "AbstractFolder", and enables commands to act on that folder (or an AbstractFolder obtained from an absolute path).

The sample quickly evolved into a prototype for upcoming features. Among other things, I needed a way to recognize a path like "E:\temp\test.zip\images" as a ZippedFolder within a zip file. Enough talking, let's look at some traces:

E:\temp\>md test.zip
E:\temp\>md test.zip\images
E:\temp\>copy "..\My Pictures\Chalet\*" test.zip\images
 100%
E:\temp\>

What have I done here? Create a folder named "test.zip"? Well, the "md" command recognized the ".zip" extension as a request to create a new empty zip file. The second "md" command actually created a new folder within the zip file. And the paths can freely use the zip filename as a folder part for any command, as shown with the copy example. If we display the contents of "E:\temp", we see the two expected files:

E:\temp\>dir

  Directory of E:\temp\

      DATE     TIME     SIZE or TYPE NAME
02/02/2005  3:00 PM         40068736 test.zip
25/01/2005  7:06 AM              143 toto.txt

  Files: 2  Folders: 0  Total file size: 40068879

E:\temp\>

As you can see, "test.zip" is really a file (DiskFile) within "E:\temp" (DiskFolder). What happens if I try changing the current folder into that zip file?

E:\temp\>cd test.zip
E:\temp\test.zip\>dir

  Directory of E:\temp\test.zip\

      DATE     TIME     SIZE or TYPE NAME
02/02/2005  3:00 PM         [FOLDER] images

  Files: 0  Folders: 1  Total file size: 0

E:\temp\test.zip\>cd images
E:\temp\test.zip\images\>dir

  Directory of E:\temp\test.zip\images\

      DATE     TIME     SIZE or TYPE NAME
06/08/2000  4:40 PM          6400006 Chaises.bmp
06/08/2000  4:35 PM          6348550 Chute.bmp
06/08/2000  4:29 PM          6337678 Ciel1.bmp
06/08/2000  4:30 PM          6396226 Ciel2.bmp
06/08/2000  4:33 PM          6414418 Ciel3.bmp
06/08/2000  4:38 PM          6524278 Couple.bmp
06/08/2000  4:37 PM          6388054 Martine.bmp
06/08/2000  4:32 PM          6405478 Ombre.bmp
06/08/2000  4:41 PM          6359254 Rochers.bmp

  Files: 9  Folders: 0  Total file size: 57573942

E:\temp\test.zip\images\>

The zip file is exposed as a folder, because the path "E:\temp\test.zip" was recognized and mapped to a ZippedFolder around a DiskFile. And "images" is nothing more than a subfolder within that root ZippedFolder, actually something like:

new ZippedFolder( new DiskFile( @"E:\temp\test.zip" ), @"\images" );
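
The mapping above can be pictured as splitting the absolute path at the first segment that names a zip file. The helper below is only a sketch of that idea: ZipPathSplitter and SplitAtZip are hypothetical names, not part of the FileSystem API, and the sketch looks only at the ".zip" extension instead of asking whether the file really is a zip file.

```csharp
using System;

public static class ZipPathSplitter
{
  // Hypothetical helper: splits "E:\temp\test.zip\images" into the path of the
  // zip file ("E:\temp\test.zip") and the path inside it ("\images").
  // Returns false when no segment ends with ".zip".
  public static bool SplitAtZip( string path, out string zipFilePath, out string innerPath )
  {
    if( path == null )
      throw new ArgumentNullException( "path" );

    zipFilePath = null;
    innerPath = null;

    string[] segments = path.Split( '\\' );
    string current = string.Empty;

    for( int i = 0; i < segments.Length; i++ )
    {
      // Rebuild the path one segment at a time.
      current = ( i == 0 ) ? segments[ 0 ] : current + "\\" + segments[ i ];

      if( segments[ i ].EndsWith( ".zip", StringComparison.OrdinalIgnoreCase ) )
      {
        zipFilePath = current;

        // Whatever follows the zip filename is a folder path inside the archive.
        innerPath = "\\" + string.Join( "\\", segments, i + 1, segments.Length - i - 1 );
        return true;
      }
    }

    return false;
  }
}
```

The two parts line up with the constructor call shown above: the first one goes to the DiskFile, the second one to the ZippedFolder. A real folder that happens to be named "something.zip" would fool this sketch, which is why asking a mapper whether the file can actually be opened as a zip file is the safer test.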

Ok, let's get into serious things:

E:\temp\test.zip\images\>cd ..\..
E:\temp\>copy *.zip RAM:\
 100%
E:\temp\>cd RAM:\test.zip\images
RAM:\test.zip\images\>dir m*.bmp

  Directory of RAM:\test.zip\images\

      DATE     TIME     SIZE or TYPE NAME
06/08/2000  4:37 PM          6388054 Martine.bmp

  Files: 1  Folders: 0  Total file size: 6388054

RAM:\test.zip\images\>

My Command Prompt exposes a root MemoryFolder called "RAM:\", which I can freely use. The commands act the same, no matter if I'm dealing with a ZippedFolder around a DiskFile or a MemoryFile. Want more?

RAM:\test.zip\images\>cd ftp://vermouth
ftp://vermouth\>dir

  Directory of ftp://vermouth\

      DATE     TIME     SIZE or TYPE NAME

  Files: 0  Folders: 0  Total file size: 0

ftp://vermouth\>md foobar.zip
ftp://vermouth\>copy "E:\My Music\WMA\Mes Aieux" foobar.zip
 100%
ftp://vermouth\>dir

  Directory of ftp://vermouth\

      DATE     TIME     SIZE or TYPE NAME
02/02/2005  3:20 PM         62936759 foobar.zip

  Files: 1  Folders: 0  Total file size: 62936759

ftp://vermouth\>dir c:\inetpub\ftproot

  Directory of c:\inetpub\ftproot\

      DATE     TIME     SIZE or TYPE NAME
02/02/2005  3:20 PM         62936759 foobar.zip

  Files: 1  Folders: 0  Total file size: 62936759

ftp://vermouth\>

FTP servers are treated like any other kind of AbstractFolder. The application simply recognizes the "FTP:" prefix as the signature of a root FtpFolder, just as it recognizes "RAM:" as a MemoryFolder. The command implementations don't care what kind of AbstractFolder or AbstractFile they are dealing with.

The engine behind this involves FileSystemMapper-derived classes. They mainly get asked two kinds of questions:

Question 1: Do you recognize this path as a root?

If so, they remove the part of the path they could recognize as a root folder, and return the matching AbstractFolder.

Examples of mappers and their responsibilities:

  • DiskMapper : Drive letters and UNC paths (yes, you can "cd" into a UNC path!)
  • FtpMapper : The "FTP:" prefix with server name, and optional username and password (e.g. ftp://user:pass@vermouth:9999)
  • IsolatedStorageMapper : A custom prefix name like "STORE:" (that's the one my sample app supports).
  • MemoryMapper : A custom prefix used to create the initial root MemoryFolder, like "RAM:" (that's the one my app supports). You can create more than one MemoryMapper to have more than one ram drive.
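
To make "Question 1" concrete, here is a minimal sketch of a mapper that recognizes a custom prefix such as "RAM:". The real FileSystemMapper signatures are not shown in this post, so PrefixRootMapper and TryMapRoot are hypothetical names, and the sketch only strips the prefix where a real mapper would also hand back the matching AbstractFolder.

```csharp
using System;

// Stand-in for a FileSystemMapper-derived class. Every name here is hypothetical.
public class PrefixRootMapper
{
  public PrefixRootMapper( string prefix )
  {
    if( prefix == null )
      throw new ArgumentNullException( "prefix" );

    if( prefix.Length == 0 )
      throw new ArgumentException( "The prefix cannot be empty.", "prefix" );

    m_prefix = prefix;
  }

  // Question 1: "Do you recognize this path as a root?"
  // On success, the recognized root is removed from 'path' and the method
  // returns true. Otherwise 'path' is left untouched.
  public bool TryMapRoot( ref string path )
  {
    if( path == null )
      throw new ArgumentNullException( "path" );

    if( !path.StartsWith( m_prefix, StringComparison.OrdinalIgnoreCase ) )
      return false;

    // A real mapper would create or look up its root AbstractFolder here.
    path = path.Substring( m_prefix.Length ).TrimStart( '\\' );
    return true;
  }

  private string m_prefix;
}
```

Given "RAM:\test.zip\images", a mapper built with the "RAM:" prefix answers yes and leaves "test.zip\images" for the next mappers to resolve. Given "E:\temp", it answers no and the path is passed on unchanged.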

Question 2: Can you represent this AbstractFile as an AbstractFolder?

If so, they simply return the matching AbstractFolder.

An example of such a mapper:

  • ZipFileMapper : It simply checks if the provided AbstractFile exists, then tries to create a ZipArchive around that AbstractFile in a try/catch. If it succeeds, it returns this ZipArchive (which derives from ZippedFolder).

Curiously, today I came across a post on our forums asking how to detect if a file is really a zip file. I gave this man the "new ZipArchive within a try/catch" solution, and he came back, as I feared, with concerns about the time wasted catching an exception for all those non-zip files. It's actually one of the bottlenecks of my Command Prompt sample: a lot of time is wasted throwing an exception for every non-zip file my app comes across. Well, I guess I'll have to work sooner rather than later on a new "ZipArchive.IsZipFile" method! :-)
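
One way to avoid the exception for most non-zip files is to peek at the first four bytes before trying to open the archive. The sketch below is not the "ZipArchive.IsZipFile" method wished for above; ZipSignature and LooksLikeZip are hypothetical names. It relies on the zip format's signatures: a regular zip file starts with a local file header ("PK" followed by 0x03 0x04), and an empty one starts with the end of central directory record ("PK" followed by 0x05 0x06).

```csharp
using System;
using System.IO;

public static class ZipSignature
{
  // Hypothetical pre-check: returns true when the stream starts with one of
  // the two signatures a plain zip file can begin with.
  public static bool LooksLikeZip( Stream stream )
  {
    if( stream == null )
      throw new ArgumentNullException( "stream" );

    byte[] header = new byte[ 4 ];
    int total = 0;

    // Read may return fewer bytes than requested, so loop until we have
    // four bytes or reach the end of the stream.
    while( total < header.Length )
    {
      int read = stream.Read( header, total, header.Length - total );

      if( read == 0 )
        return false;

      total += read;
    }

    // 'P', 'K'
    if( header[ 0 ] != 0x50 || header[ 1 ] != 0x4B )
      return false;

    return ( header[ 2 ] == 0x03 && header[ 3 ] == 0x04 )   // local file header
        || ( header[ 2 ] == 0x05 && header[ 3 ] == 0x06 );  // empty archive
  }
}
```

A "false" here only means "probably not a zip file". Self-extracting zip files start with an executable stub, so they fail this test and still need the slower try/catch path.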

Now, you have to convince my boss I should put more time into this sample and these new FileSystem features! Is mapping absolute paths, as shown above, to their proper AbstractFolder or AbstractFile something that could be useful to you?


FileSystem | FTP | Samples | Zip

2/2/2005 3:55:51 PM (Eastern Standard Time, UTC-05:00)  #   
 Monday, December 13, 2004

I admit it: I work on zip file compression, and I don't even use something homemade to unzip the zip files I run into. I use WinZip's context menu.

Well, I should say that in the past tense. I've made a man of myself and implemented my own "Unzip Here" context menu, which uses Xceed Zip ActiveX 5.x. Why reinvent the wheel when it works fine? Because it didn't work that fine for me.

How many times have I right-clicked on a zip file, gone to the WinZip menu, and stared at "Extract to here" and "Extract to d:\someplace\somewhere\zipfilename", only to find myself asking: "Does that zip file already contain paths?" If it does, I don't need to create a "zipfilename" subfolder, so I should select the first menu item. But if it doesn't, I sure don't want all the unzipped files to end up in the current folder, so I want to select the second menu item. I end up opening the zip file just to view the file paths.

That's what I just implemented. You right-click on a zip file, you click on Unzip Here, and it will automatically detect if it needs to create a subfolder (using the zip filename) or not, then unzip everything.
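
The exact rule the extension applies is not spelled out here, so the sketch below assumes a simple one: create the subfolder as soon as at least one file is stored without a folder part. UnzipHereHeuristic and NeedsSubfolder are hypothetical names, and the entry names are passed in as plain strings to keep the example independent from any zip library.

```csharp
using System;

public static class UnzipHereHeuristic
{
  // Assumed rule, not necessarily the one the extension uses: a subfolder is
  // needed when at least one entry would land directly in the current folder.
  public static bool NeedsSubfolder( string[] entryNames )
  {
    if( entryNames == null )
      throw new ArgumentNullException( "entryNames" );

    char[] separators = new char[] { '/', '\\' };

    foreach( string name in entryNames )
    {
      if( name == null )
        continue;

      // Zip entries normally use '/', but tolerate '\' and leading separators.
      string trimmed = name.TrimStart( separators );

      if( trimmed.Length == 0 )
        continue;

      // No separator left means the file sits at the root of the zip file.
      if( trimmed.IndexOfAny( separators ) < 0 )
        return true;
    }

    return false;
  }
}
```

With this rule, a zip file holding "readme.txt" and "images/Chute.bmp" gets its own subfolder, while one holding only "images/Chute.bmp" and "images/Ciel1.bmp" is unzipped in place.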

I won't go into the full details of how to create a Windows Shell Extension component; the sample is pretty self-explanatory, and the web is filled with tutorials. In short, you:

  • Create a new ATL COM AppWizard project (VC++ 6).
  • Add a new Simple Object with default names and attributes (make sure not to select "Free Threaded").
  • Remove references to the newly created interface, you don't need it. (I left the IDL in there instead of copying the CLSID somewhere else... I'm lazy).
  • Remove the type library from the resources and RGS file, you don't need it.
  • Implement IShellExtInit and IContextMenu interfaces (see UnzipHereExtension.cpp).
  • Add the required registry keys (see "DllRegisterServer" in UnzipHere.cpp).

The heart of the extension resides in IContextMenu::InvokeCommand. Don't forget more than one file can be selected when your context menu gets called.

While debugging, you'll often need to restart explorer.exe in order to release its hold on your DLL. Use the Task Manager's run menu to reload it. If you don't like ending a task via the Task Manager, try this: Start Menu -> Shutdown, press Ctrl-Alt-Shift and click Cancel. The explorer.exe process will end.

On my TODO list:

  • Support zip files not ending with the ZIP extension (like self-extracting zip files).
  • Implement a "Zip This" menu.
  • Add a "File already exists. Do you want to overwrite?" prompt.
  • Hide the "aborted" error on non-zip files.

Comments welcome! Have fun!

UnzipHere.zip (21 KB)

Zip | Samples

12/13/2004 3:57:24 PM (Eastern Standard Time, UTC-05:00)  #