Tuesday, January 31, 2006

Ever since I've been working with the .NET framework, most of my time was spent on the System.IO namespace. I'm not a UI guy, I'm an IO guy! The most important class in that namespace is System.IO.Stream. And since it was well-designed, probably inspired by other successful stream implementations (Delphi comes to mind), it's very easy to expose features using streams.

My favorite use of streams is for pass-through streams. A pass-through stream is a class which derives from System.IO.Stream, but reads from or writes to an inner stream received at creation. It serves as a data modifyer or data analyser. When reading from a pass-through stream, it first reads from its inner stream, then processes the data read (potentially modifying it) and returns this data. When writing to a pass-through stream, it first processes the provided data (again potentially modifying it), then writes it to its inner stream.

Xceed Zip for .NET and Xceed FTP for .NET both use a pletoria of pass-through streams. The most popular is Xceed.Compression.CompressedStream, the stream responsible for compressing data before writing it to its inner stream, or decompressing data read from its inner stream. But most others are internal. We've been juggling with the idea of exposing them for a long time, but beleive it would only confuse developers to "see" those new namespaces and classes. Another useful thing with internal classes is that we can change their interface without causing breaking changes.

TransientStream

It was a long debate before we decided to go forth with the "transient" keyword. Not only is it used in the TransientStream type name, but also as a property on many of our pass-through streams. What we meant by "transient" is "volatile", or if you prefer more explicit keywords, "does-not-close-its-inner-stream-when-closed". A TransientStream is about the simplest expression of a pass-through stream. All required property and method overrides simply call the inner stream. The only exception is for the Close method, which simply makes sure not to call Close on the inner stream. This is very useful when you need to pass your stream to another routine which closes the stream, while you don't want your stream to get closed.

ChecksumStream

This stream does not modify the data read from or written to, but takes the opportunity to calculate either a CRC32 or an Adler32 on that data. When reading, it can also make sure, upon closing it, that the calculated checksum matches an expected stream, else throw an exception. In this way, we can insert checksum calculation anywhere in a process without interfering nor requiring code changes.

CombinedStream

The deflate compression algorithm has the ability to detect the end of the data when decompressing. The CompressedStream is itself a pass-through stream. When reading from it, it first reads from the inner stream, then decompresses the data. When it reaches the end of the compressed data, the CompressedStream has the ability to return a stream on the remaining data, in case this inner stream contains more data after the compressed block. Why isn't this equivalent to the inner stream you might ask? Let's say the inner stream isn't seekable. The CompressedStream's Read method first reads N bytes from the inner stream, but may have found that the end of the compressed data is after M bytes (M < N). The inner stream is already N-M bytes too far. The CombinedStream receives both a byte array (the unused N-M bytes) and the inner stream as ctor parameters, and will expose those as one contiguous stream. Pretty slick!

HeaderFooterStream

Xceed Streaming Compression for .NET exposes stream-based (as opposed to archive-based) compression formats. Those formats all have one thing in common: they have a header and a footer. Not all of them can depend on the deflate algorithm to automatically detect the end of the stream. That's why they need to make sure to never return the first M bytes and last N bytes from the inner stream, where M is the expected header size and N the expected footer size.

WindowStream

When exposing part of a zip file as a single AbstractFile, we need to make sure we do not read past the boundaries of that file's data in the zip file. The WindowStream exposes a region of its inner stream as a zero-position, N-length stream.

ZCryptStream

This pass-though stream automatically encrypts or decrypts the data written or read, using the basic Zip encryption (which is as weak as me in front of a cheese cake). I will be working on AES encryption very soon, and it will most probably be implemented as a pass-through stream too!

NotifyStream

Though pass-through streams can do much of the task, it is often better for the clarity of the code to have processing done by other classes not deriving from System.IO.Stream. The NotifyStream class exposes three events: ReadingFromStream, WritingToStream and ClosingStream. Any other class can advise for those events to intervene in the reading or writing process. This old class exists since the beginning of Xceed Zip for .NET, but it has proven very useful in the current development we are doing for Tar and GZip support within Xceed Zip for .NET.

ForwardSeekableStream

This new class created for Xceed Zip for .NET 3.0 (Tar and GZip support) can expose a non-seekable stream as a seekable stream when reading, or at least a stream reporting a Position when writing. When reading, you can call Seek with an offset behond the current position, and it will simply read from the non-seekable inner stream until well positioned. And for both reading and writing operations, it counts the number of bytes read or written so it can report a position (granting we knew the original position when created).

FtpAsciiDataStream

Xceed FTP for .NET also uses pass-through streams. For example, the FtpAsciiDataStream wraps the NetworkStream to perform convertion of LF to CR/LF on the fly when sending a file in ASCII mode.


.NET | Compression | FileSystem | FTP | Zip

1/31/2006 9:47:29 AM (Eastern Standard Time, UTC-05:00)  #   
 Tuesday, January 17, 2006

One of the less known features of the Xceed FileSystem is its file filtering capabilities. Not only does it come with built-in support for filtering files based on name, size, attributes and dates, it also lets you easily combine criterias. Furthermore, as for all Xceed components, it's fully extensible.

For example, let's say I want to copy files matching the "*.txt" filter that have the archive attribute on. The following code can be used:

  sourceFolder.CopyFilesTo( destFolder, true, true, "*.txt", FileAttributes.Archive );

What is happening beneath the surface? The fouth parameter is "params object[] filters". This means you can provide any number of any types of parameters. Any types? Not exactly. What you should see is "params Filter[] filters". The Filter class is the base class for any type of filter you could think of. The Xceed FileSystem comes with seven built-in filter classes, divided in two categories:

Operators: AndFilter, OrFilter, NotFilter.
Filters: NameFilter, AttributeFilter, SizeFilter, DateTimeFilter.

So the line of code above can be seen as this:

  sourceFolder.CopyFilesTo( destFolder, true, true,
    new AndFilter( new NameFilter( "*.txt" ), new AttributeFilter( FileAttributes.Archive ) ) );

But we've decided that forcing the creation of a new NameFilter everytime you want to filter on a mask was overkill for such a common operation. That's why we also accept two other types of parameters. Strings are automatically converted to a NameFilter, and FileAttributes are automatically converted to an AttributeFilter. Finally, providing two or more filters as separate parameters automatically puts them in an AndFilter.

But then, what happens to another common scenario: filtering files based on two name filters? Passing "*.txt" as the fourth parameter, and "*.doc" as the fifth would generate an AndFilter around them, thus only matching files that match the ".txt" and the ".doc" extensions... Oups!

We support yet another exception: any string filter can contain a pipe character (|) for providing multiple name filters that will be grouped in an OrFilter, like this:

  sourceFolder.CopyFilesTo( destFolder, true, true, "*.txt|*.doc" );

This will automatically be translated to:

  sourceFolder.CopyFilesTo( destFolder, true, true, 
    new OrFilter( new NameFilter( "*.txt" ), new NameFilter( "*.doc" ) ) );

By the way, most operator-like filters' constructors will accept strings and FileAttributes too, doing the translation to NameFilter and AttributeFilter instances for you.

The final "hidden" feature relates to case sensitivity. By default, the FileSystem is case insensitive, as is the Windows platform. But since archives like zip files may come from other planets like Linux or Mac OS X, we wanted to support case-sensitive file matching. If you prepend your string with the "greater than" character (<), the resulting NameFilter will be case-sensitive. The following code will only match files which have their extension in upper-case:

  sourceFolder.CopyFilesTo( destFolder, true, true, ">*.TXT" );

Since Windows does remember the casing of filenames, this can be very useful even on the Windows platform. Furthermore, since we released the library, the Mono project came to life, and our library can now be used on other platforms.

Extending filters

You can easily create custom filters by deriving from the Xceed.FileSystem.Filter class and overriding the IsItemMatching method. A SearchFilter class, which searches for a particular text within files could look like this:

  class SearchFilter : Filter
  {
    public SearchFilter( string text )
      : base( FilterScope.File )
    {
      if( text == null )
        throw new ArgumentNullException( "text" );

      if( text.Length == 0 )
        throw new ArgumentException( "The text cannot be empty.", "text" );

      m_text = text;
    }

    public override bool IsItemMatching( FileSystemItem item )
    {
      AbstractFile file = item as AbstractFile;

      if( file == null )
        return false;

      try
      {
        int bufferSize = ( file.Size < 0x1000000 )
          ? unchecked( ( int )file.Size )
          : 0x1000000;

        byte[] search = System.Text.Encoding.Default.GetBytes( m_text );

        if( search.Length <= bufferSize )
        {
          byte[] buffer = new byte[ bufferSize ];
          int found = 0;

          using( BinaryReader reader = new BinaryReader( file.OpenRead( FileShare.ReadWrite ) ) )
          {
            int read = 0;

            while( ( read = reader.Read( buffer, 0, bufferSize ) ) > 0 )
            {
              found = FindBuffer( buffer, 0, read, search, found );

              if( found == search.Length )
                return true;
            }
          }
        }     
      }
      catch {}

      return false;
    }

    private int FindBuffer(
      byte[] source,
      int sourceStart,
      int sourceCount,
      byte[] search,
      int searchIndex )
    {
      // TODO: Param check!

      for( int i=0; i<sourceCount; i++ )
      {
        if( source[ sourceStart + i ] == search[ searchIndex ] )
        {
          if( ++searchIndex == search.Length )
            return searchIndex;
        }
        else
        {
          searchIndex = 0;
        }
      }

      return searchIndex;
    }

    private string m_text; // = null
  }

Using this custom filter, you can now copy only files that contain a particular text:

  sourceFolder.CopyFilesTo( destFolder, true, true, new SearchFilter( "allo" ) );

Conditionally recursing

One missing feature we had with the filtering will be addressed with today's release. There were no way to control which subfolders to recurse into or not when calling methods accepting filters (CopyFilesTo, MoveFilesTo, GetFiles, GetFolders). The FilterScope.Folder value wasn't preventing recursing into subfolders. It was only meant to include or exclude folder entries from being copied. But passing "true" or "false" as the "recurse" parameter was an all or nothing deal.

Today, we introduce a new scope: FilterScope.Recurse. It does not interfere with the File or Folder socpe, and only determines if we should continue matching files into each subfolder. Its number one use is for excluding subfolders:

  sourceFolder.CopyFilesTo( destFolder, true, true, 
    "*.txt", new NotFilter( new NameFilter( "Bar", FilterScope.Recurse ) ) );

The way you combine "Recurse" filters and other filters is irrelevant. When deciding to copy files or folders, the library ignores any filters with the Recurse scope. When deciding to call itself recursively, the library ignores any filters with the File or Folder scope.

We plan on providing new types of filters. Suggestions welcomed!



1/17/2006 9:35:07 AM (Eastern Standard Time, UTC-05:00)  #   
 Tuesday, November 01, 2005

Just in case my previous post on the subject did not ring a bell, the release of version 2.1 of Xceed FTP for .NET means you can directly unzip from a zip file located on an FTP server, without downloading the file first! Look at the following code:

  using( FtpConnection connection = new FtpConnection( "ftp.xceed.com" ) )
  {
    FtpFile source = new FtpFile( connection, @"/images/Flowers/Backup/Flowers.zip" );
    DiskFolder dest = new DiskFolder( @"d:\temp\flowers" );

    ZipArchive zip = new ZipArchive( source );
    zip.CopyFilesTo( dest, true, true );
  }

The secret behind this code is the kind of stream "FtpFile.OpenRead" returns. Though we are dealing with a network connection, this stream is fully seekable! The FtpFile takes advantage of the "REST" FTP command, which tells the FTP server we wish to start the transfer at a specific offset. Thus, when the ZipArchive needs to seek at the end of the file to locate the ending header, a proper "REST" command is issued to avoid having to read all the zip file first. And the same happens when reading the central directory, or unzipping specific files.


.NET | FileSystem | FTP | Zip

11/1/2005 4:15:40 PM (Eastern Daylight Time, UTC-04:00)  #   
 Friday, October 28, 2005

I have put forward my incredible talent with ASCII art (sic) and updated my wonderful "FileSystem.txt" file, which describes classes available with the Xceed FileSystem for .NET.

                             ==============
                             FileSystemItem
                             ==============
                                    |
                     +--------------+------------------+
                     |                                 |
              ==============                     ============
              AbstractFolder                     AbstractFile
              ==============                     ============
                     |                                 |
         +---+---+---+---+---+---+     +---+---+---+---+---+---+---+
         |   |   |   |   |   |   |     |   |   |   |   |   |   |   |
 ==========  |   |   |   |   |   |     |   |   |   |   |   |   |  ========
 DiskFolder  |   |   |   |   |   |     |   |   |   |   |   |   |  DiskFile
 --============  |   |   |   |   |     |   |   |   |   |   |  ==========--
   MemoryFolder  |   |   |   |   |     |   |   |   |   |   |  MemoryFile
   --==============  |   |   |   |     |   |   |   |   |  ============--
     IsolatedFolder  |   |   |   |     |   |   |   |   |  IsolatedFile
     ------============  |   |   |     |   |   |   |  ============----
           ZippedFolder  |   |   |     |   |   |   |  ZippedFile
           -------=========  |   |     |   |   |  =======-------
                  FtpFolder  |   |     |   |   |  FtpFile
                  -============  |     |   |  ==========-
                   TarredFolder  |     |   |  TarredFile
                   ---=============    |  ===========---
                      GZippedFolder    |  GZippedFile
                      -------------   ==========-----
                                      StreamFile
                                      ----------

Can you guess what I'm working on?



10/28/2005 3:36:45 PM (Eastern Daylight Time, UTC-04:00)  #   
 Wednesday, October 05, 2005

I'm glad to announce that Xceed FTP for .NET 2.1 is now available for download. I've been working on this release for the past few months, and I'm very excited to finally see the FTP FileSystem come to life.

For those not familiar with the Xceed FileSystem, which comes with Xceed Zip for .NET, here is some code that sheds some light on what you can do with it. Consider these variables of the same base type:

// A file on disk
AbstractFile first = new DiskFile( @"d:\FileSystem.txt" );
 
// Another file on disk
AbstractFile second = new DiskFile( @"c:\temp\AnotherFileSystem.txt" );
 
// A file compressed in a zip file on disk
AbstractFile third = new ZippedFile( 
  new DiskFile( @"c:\temp\data.zip" ), "FileSystemInAZip.txt" );;
 
// A file in the isolated storage
AbstractFile fourth = new IsolatedFile( "Isolated.txt" );
 
// A file in memory (random name)
AbstractFile fifth = new MemoryFile();
 
// A file compressed in a zip file in memory
AbstractFile sixth = new ZippedFile( 
  new MemoryFile(), "VolatileFileSystem.txt" );

You can copy files around very easily:

// Copying the first file anywhere else is always the same!
first.CopyTo( second, true );
first.CopyTo( third, true );
first.CopyTo( fourth, true );
first.CopyTo( fifth, true );
first.CopyTo( sixth, true );

And accessing the contents of any file is always the same:

private void DisplayTextFile( AbstractFile file )
{
  Console.WriteLine( "Displaying the contents of {0}, which is a {1}.", 
    file.FullName,
    file.GetType().Name );
 
  using( StreamReader reader = new StreamReader( file.OpenRead() ) )
  {
    string line;
 
    while( ( line = reader.ReadLine() ) != null )
    {
      Console.WriteLine( line );
    }
  }
 
  Console.WriteLine();
}

// Displaying the contents of those files is always the same!
DisplayTextFile( first );
DisplayTextFile( second );
DisplayTextFile( third );
DisplayTextFile( fourth );
DisplayTextFile( fifth );
DisplayTextFile( sixth );

And why not finish this demonstration by deleting the files we just created.

// And finally, deleting files is the same!
second.Delete();
third.Delete();
fourth.Delete();
fifth.Delete();
sixth.Delete();

Any kind of file is an AbstractFile. Any kind of folder is an AbstractFolder. This way, a DiskFile, an IsolatedFile, a ZippedFile and a MemoryFile share a common set of properties and methods for accessing their metadata and reading/writing their actual data. And a DiskFolder, an IsolatedFolder, a ZippedFolder and a MemoryFolder share a common set of methods for discovering child items.

How does the FTP FileSystem come into play? Simply by offering the same abstraction over files and folders stored on an FTP server. We could simply add this code to the above sample, and everything works just as expected!

AbstractFile seventh = new FtpFile( 
  new FtpConnection( "localhost" ), @"\RemoteFileSystem.txt" );
 
first.CopyTo( seventh, true );
 
DisplayTextFile( seventh );
 
seventh.Delete();

Let's dive a little bit into this implementation of an AbstractFile and AbstractFolder. Each constructor requires an FtpConnection instance, which contains information on how to connect to the target FTP server. Though it looks like a simple information storage class, it does much more. Each time an FtpFile or an FtpFolder requires information, or an incoming or outgoing stream on a file's data, it asks the FtpConnection for an active command channel connection to the server. This way, a unique command channel is generally required for accessing many files on the server.

using( FtpConnection connection = new FtpConnection( "ftp.xceed.com" ) )
{
  connection.TraceWriter = Console.Out;
 
  FtpFolder root = new FtpFolder( connection );
 
  foreach( AbstractFile file in root.GetFiles( false, "*.txt" ) )
  {
    DisplayTextFile( file );
  }
}

If we comment out the "Console.WriteLine( line );" line in "DisplayTextFile", we can see the FTP conversation that occurred for the above code:

Connected to 66.46.177.250:21 on 9/27/2005 @ 2:24:54 PM
< 220 Serv-U FTP Server v6.0 for WinSock ready...
> USER anonymous
< 331 User name okay, please send complete E-mail address as password.
> PASS *****
< 230 User logged in, proceed.
> PWD
< 257 "/" is current directory.
> CWD /
< 250 Directory changed to /
> CWD /
< 250 Directory changed to /
> CWD /
< 250 Directory changed to /
> TYPE A
< 200 Type set to A.
> PASV
< 227 Entering Passive Mode (66,46,177,250,6,238)
> LIST
Data connection established with 66.46.177.250:1774 on 9/27/2005 @ 2:24:56 PM
< 150 Opening ASCII mode data connection for /bin/ls.
< 226-Maximum disk quota limited to Unlimited kBytes
< Used disk quota 0 kBytes, available Unlimited kBytes
< 226 Transfer complete.
Displaying the contents of \FileSystem.txt, which is a FtpFile.
> CWD /
< 250 Directory changed to /
> TYPE I
< 200 Type set to I.
> PASV
< 227 Entering Passive Mode (66,46,177,250,6,240)
> RETR FileSystem.txt
Data connection established with 66.46.177.250:1776 on 9/27/2005 @ 2:24:57 PM
< 150 Opening BINARY mode data connection for FileSystem.txt (1198 Bytes).
< 226-Maximum disk quota limited to Unlimited kBytes
< Used disk quota 0 kBytes, available Unlimited kBytes
< 226 Transfer complete.

Displaying the contents of \appnote.txt, which is a FtpFile.
> CWD /
< 250 Directory changed to /
> TYPE I
< 200 Type set to I.
> PASV
< 227 Entering Passive Mode (66,46,177,250,6,246)
> RETR appnote.txt
Data connection established with 66.46.177.250:1782 on 9/27/2005 @ 2:24:58 PM
< 150 Opening BINARY mode data connection for appnote.txt (109785 Bytes).
< 226-Maximum disk quota limited to Unlimited kBytes
< Used disk quota 0 kBytes, available Unlimited kBytes
< 226 Transfer complete.

> QUIT
Disconnected from 66.46.177.250:21 on 9/27/2005 @ 2:25:04 PM

Each FtpFolder and FtpFile instance shared the same FtpConnection, and since no two operations were done at the same time, a single connection was required, as the log indicates. The FtpConnection object implements the IDisposable interface, since it keeps any active connection available until disposed (or finalized).

Now what happens if I try to open two files at the same time, like this?

using( FtpConnection connection = new FtpConnection( "ftp.xceed.com" ) )
{
  connection.TraceWriter = Console.Out;
 
  AbstractFile first = new FtpFile( connection, @"\FileSystem.txt" );
  AbstractFile second = new FtpFile( connection, @"\appnote.txt" );
 
  using( Stream firstStream = first.OpenRead() )
  {
    using( Stream secondStream = second.OpenRead() )
    {
      // In an FTP conversation with an FTP server, only one command
      // at a time can be pending. Here, we clearly have two files
      // open at the same time on the same FTP server. How? Each file
      // has its own connection to the FTP server!
    }
  }
}

The FtpConnection object will create extra command channel connections as required. The output shows two command channel connections were made:

Connected to 66.46.177.250:21 on 9/27/2005 @ 2:38:06 PM
< 220 Serv-U FTP Server v6.0 for WinSock ready...
> USER anonymous
< 331 User name okay, please send complete E-mail address as password.
> PASS *****
< 230 User logged in, proceed.
> CWD /
< 250 Directory changed to /
> TYPE A
< 200 Type set to A.
> PASV
< 227 Entering Passive Mode (66,46,177,250,10,53)
> LIST
Data connection established with 66.46.177.250:2613 on 9/27/2005 @ 2:38:07 PM
< 150 Opening ASCII mode data connection for /bin/ls.
< 226-Maximum disk quota limited to Unlimited kBytes
<     Used disk quota 0 kBytes, available Unlimited kBytes
< 226 Transfer complete.
> CWD /
< 250 Directory changed to /
> TYPE I
< 200 Type set to I.
> PASV
< 227 Entering Passive Mode (66,46,177,250,10,54)
> RETR FileSystem.txt
Data connection established with 66.46.177.250:2614 on 9/27/2005 @ 2:38:08 PM
< 150 Opening BINARY mode data connection for FileSystem.txt (1198 Bytes).
Connected to 66.46.177.250:21 on 9/27/2005 @ 2:38:08 PM
< 220 Serv-U FTP Server v6.0 for WinSock ready...
> USER anonymous
< 226-Maximum disk quota limited to Unlimited kBytes
<     Used disk quota 0 kBytes, available Unlimited kBytes
< 226 Transfer complete.
< 331 User name okay, please send complete E-mail address as password.
> PASS *****
< 230 User logged in, proceed.
> CWD /
< 250 Directory changed to /
> TYPE I
< 200 Type set to I.
> PASV
< 227 Entering Passive Mode (66,46,177,250,10,55)
> RETR appnote.txt
Data connection established with 66.46.177.250:2615 on 9/27/2005 @ 2:38:09 PM
< 150 Opening BINARY mode data connection for appnote.txt (109785 Bytes).
< 426-Maximum disk quota limited to Unlimited kBytes
<     Used disk quota 0 kBytes, available Unlimited kBytes
< 426 Data connection closed, file transfer appnote.txt aborted.
> QUIT
Disconnected from 66.46.177.250:21 on 9/27/2005 @ 2:38:13 PM
> QUIT
Disconnected from 66.46.177.250:21 on 9/27/2005 @ 2:38:13 PM

And the great part about all this is that you don't have to worry about this while coding. You're just manipulating yet another kind of AbstractFile or AbstractFolder.

If we get back to the Zip implementation of the Xceed FileSystem, you can see that a ZippedFile or ZippedFolder (or ZipArchive, the root ZippedFolder) constructor needs to know which AbstractFile is holding the actual zip file that should contain this file or folder. "AbstractFile" truly means "any file", as long as there is an AbstractFile-derived class somewhere to expose this file. It means that zipping directly onto an FTP server is no more difficult than zipping in a regular file on disk.

AbstractFolder source = new DiskFolder( @"d:\Data" );
 
AbstractFolder localDest = new ZipArchive( 
  new DiskFile( @"d:\temp\local.zip" ) );
 
AbstractFolder remoteDest = new ZipArchive(
  new FtpFile( new FtpConnection( "localhost" ), @"remote.zip" ) );
 
// Copying is the same, no matter what is the destination
// file or folder.
source.CopyTo( localDest, true );
source.CopyTo( remoteDest, true );

Code for zipping in "D:\temp\local.zip" is no different than code for zipping in "ftp://localhost/remote.zip". And obviously, reading or unzipping from any zip file is the same.

AbstractFolder localSource = new ZipArchive( 
  new DiskFile( @"d:\temp\local.zip" ) );
 
AbstractFolder remoteSource = new ZipArchive(
  new FtpFile( new FtpConnection( "localhost" ), @"remote.zip" ) );
 
AbstractFolder dest = new DiskFolder( @"d:\restored" );
 
// Unzipping text files from any source is the same!
localSource.CopyFilesTo( dest, true, true, "*.txt" );
remoteSource.CopyFilesTo( dest, true, true, "*.txt" );

I really hope this new addition to the Xceed FileSystem will generate the same enthusiasm we had inventing and developping it. I'm very interested in hearing your opinions!


.NET | FileSystem | FTP | Zip

10/5/2005 3:09:39 PM (Eastern Daylight Time, UTC-04:00)  #   
 Thursday, July 14, 2005

I'm currently working hard on the FileSystem support for Xceed FTP for .NET, and I had to share with you a screenshot of one of my tests. My XceedBox.NET sample is now fully supporting FtpFile and FtpFolder mapping.

XceedBoxNET.png

As you can see, my working folder is a zip file located in a folder on an FTP server. The "type" command above has no idea what kind of file it is displaying. It receives an AbstractFile, gets a Stream on it via OpenRead, reads and displays the contents. That AbstractFile is actually a ZippedFile found in a ZippedFolder constructed around an FtpFile found in an FtpFolder.


.NET | FileSystem | FTP

7/14/2005 4:04:21 PM (Eastern Daylight Time, UTC-04:00)  #   
 Thursday, June 09, 2005

In the new package v1.2.5309 which will be available for download next week resides a new feature you won't see much emphasis about, but which I was very eager to complete. You can now create a ZipArchive instance around an AbstractFile that does not support reading from.

(drum roll) ... (looking around) ... Nobody's applauding? That's because you probably don't know yet how useful this can be.

Most ASP.NET applications that wish to create zip files on the fly and send them in the response are either stuck with creating those zip files on disk in a temporary filename, or create them in a MemoryFile, then copy that MemoryFile in the response stream.

However, the StreamFile class was created for such purposes of exposing any existing Stream as an AbstractFile. You already could create a StreamFile around the Response's OutputStream. But passing that StreamFile to the ZipArchive's constructor would fail, because it can't read from it. Instead of assuming an empty zip file, it miserably failed. Shame.

No more... Since version 2.2.5302, it will assume the zip file is empty. So code like this works perfectly:

    public void ProcessRequest(HttpContext context)
    {
      context.Response.ContentType = "application/zip";
      context.Response.AddHeader( "Content-Disposition", "attachment; filename=images.bmp" );
 
      ZipArchive archive = new ZipArchive( new StreamFile( context.Response.OutputStream ) );
      DiskFolder source = new DiskFolder( context.Request.MapPath( "." ) );
 
      source.CopyFilesTo( archive, false, false, "*.bmp" );
    }

The same problem appeared when trying to combine Xceed Zip for .NET with Xceed FTP for .NET, to upload zip files directly on the FTP server. Though the FtpClient class exposes a very useful GetUploadStream method to get a direct stream on the data connection, code like this previously failed.

          using( Stream upload = client.GetUploadStream( "images.zip" ) )
          {
            ZipArchive archive = new ZipArchive( new StreamFile( upload ) );
            DiskFolder source = new DiskFolder( @"d:\images\" );
 
            source.CopyFilesTo( archive, false, false, "*.bmp" );
          }

Talk about short and sweet uploads of zip files!


.NET | FileSystem | FTP | Zip

6/9/2005 2:56:43 PM (Eastern Daylight Time, UTC-04:00)  #   
 Tuesday, March 08, 2005

Warning: Do not try this at home!

A few days ago, Pierre-Luc at support asked me if Xceed Zip for .NET was thread safe. I knew from his look that he was expecting a "yes" or a "no". At least, that's what the client who asked him the same question expected.

My first answer was more in nuances: Though the library was made to be safely accessible from multiple threads at the same time, by the nature of the sequential format of the zip file, it is not possible to work on the same zip file from multiple threads.

He nodded with approbation, confirming me his client wasn't trying such crazy action, but simply dealing with a multi-threaded application where each thread may be zipping in its own private file. I gave him my benediction: In that case, yes, Xceed Zip for .NET is thread safe.

Pierre-Luc wasn't two feet away when I was illuminated by an idea. It wouldn't be that crazy to try zipping into the same zip file from multiple threads. How neat would it be to benefit from multi processor or hyperthreading machines for zipping a single file? Guess what... you can! You shouldn't... but you can! Don't ask us to support this scenario... but you can!

Here's the deal. Any ZipArchive you modify gets updated when the last modify operation occurs. If you know you're about to make more than one modification to a single zip file, you should first call BeginUpdate, do all modifications, and finally call EndUpdate. The zip file will only get rebuilt on that final call. The files you copy into the zip file before EndUpdate will be compressed and stored in temp files within the ZipArchive's TempFolder.

That means any copy operation you perform within a BeginUpdate/EndUpdate block are atomic, and only involve compressing the sources into independant temp files. You see where I'm heading? How about spawning threads within that block, each thread copying its own source, and waiting for all threads to finish before calling EndUpdate?

I had to try it. I started with a class implementing IAsyncResult, which would be managing the copy operation on a separate thread:

  internal class AsyncCopyResult : IAsyncResult

  {

    public AsyncCopyResult(

      AbstractFolder source,

      AbstractFolder dest,

      AsyncCallback callback,

      object state )

    {

      m_source = source;

      m_dest = dest;

      m_callback = callback;

      m_state = state;

 

      m_thread = new Thread( new ThreadStart( this.ThreadProc ) );

      m_completed = new ManualResetEvent( false );

    }

 

    public void Begin()

    {

      m_completed.Reset();

      m_thread.Start();

    }

 

    public void End()

    {

      // We must not join thread since we may get called by callback, itself

      // within thread.

      m_completed.WaitOne();

 

      if( m_result != null )

        throw m_result;

    }

 

    #region IAsyncResult IMPLEMENTATION

 

    public object AsyncState

    {

      get { return m_state; }

    }

 

    public bool CompletedSynchronously

    {

      get { return false; }

    }

 

    public WaitHandle AsyncWaitHandle

    {

      get { return m_completed; }

    }

 

    public bool IsCompleted

    {

      get { return m_completed.WaitOne( 1, false ); }

    }

 

    #endregion

 

    private void ThreadProc()

    {

      try

      {

        m_result = null;

 

        if( m_source == null )

          throw new ArgumentNullException( "source" );

 

        if( m_dest == null )

          throw new ArgumentNullException( "dest" );

 

        if( m_source.IsRoot )

        {

          m_source.CopyFilesTo( m_dest, true, true );

        }

        else

        {

          m_source.CopyTo( m_dest, true );

        }

      }

      catch( Exception except )

      {

        m_result = except;

      }

 

      m_completed.Set();

 

      if( m_callback != null )

      {

        try

        {

          // Important note: This callback may be calling End.

          // Thus End's implementation should not wait for thread,

          // but for handle.

          m_callback( this );

        }

        catch

        {

          System.Diagnostics.Debug.WriteLine( "Unhandled exception within callback." );

        }

      }

    }

 

    private Thread m_thread = null;

    private ManualResetEvent m_completed = null;

 

    private AbstractFolder m_source = null;

    private AbstractFolder m_dest = null;

    private AsyncCallback m_callback = null;

    private object m_state = null;

 

    private Exception m_result = null;

  }

The ThreadProc method is simply copying the source folder into the destination folder. The rest is plumbing for implementing IAsyncResult. In my main class, I created a static method that uses the AsyncCopyResult class like this:

    private static void CopyMultipleFolders(

      AbstractFolder[] sources,

      AbstractFolder dest )

    {

      // I'm using AutoBatchUpdate with the using directive, an easy way

      // to call BeginUpdate and EndUpdate only if the folder implements

      // IBatchUpdateable.

      using( AutoBatchUpdate auto = new AutoBatchUpdate( dest ) )

      {

        AsyncCopyResult[] results = new AsyncCopyResult[ sources.Length ];

 

        // First create the threads and state objects.

        for( int i=0; i<sources.Length; i++ )

        {

          results[ i ] = new AsyncCopyResult( sources[ i ], dest, null, null );

        }

 

        // Then launch each thread

        foreach( AsyncCopyResult result in results )

        {

          result.Begin();

        }

 

        // We can't call WaitAll on an STA thread, but it doesn't matter.

        // We wait for each one separately.

        foreach( AsyncCopyResult result in results )

        {

          result.AsyncWaitHandle.WaitOne();

 

          try

          {

            result.End();

          }

          catch( Exception except )

          {

            Console.WriteLine( except.Message );

          }

        }

      }

    }

Once each thread is done copying its own source folder into the destination folder, the AutoBatchUpdate class calls EndUpdate on the destination folder (if it implements IBatchUpdateable). In the case of a zip file destination, the final zip file is built by reassembling already compressed temp files. Here's an example of how to call CopyMultipleFolders:

        AbstractFile target = new DiskFile( @"d:\temp\multi.zip" );

 

        if( target.Exists )

          target.Delete();

 

        ZipArchive zip = new ZipArchive( target );

        AbstractFolder firstSource = new DiskFolder( @"d:\Downloads" );

        AbstractFolder secondSource = new DiskFolder( @"d:\Music" );

 

        CopyMultipleFolders( new AbstractFolder[] { firstSource, secondSource }, zip );

The best thing is that this method works for any kind of AbstractFolder, source or destination. If you're confident the size of the zip file isn't too large, you can improve performance by setting the ZipArchive's TempFolder to a new MemoryFolder.

But remember: Don't try this! I didn't tell you it was possible.



3/8/2005 10:56:48 AM (Eastern Standard Time, UTC-05:00)  #   
 Wednesday, February 23, 2005

I had to take a look at an issue with a spanned zip file. The client graciously sent us the set of zip files he had on floppies. He copied each floppy in subfolders named "Disk1", "Disk2", "Disk3" and "Disk4", zipped those folders and sent us the resulting single huge zip file. Pretty standard.

I was about to copy each part back on floppies when I realized I was passing by a huge feature Xceed Zip for .NET offers regarding spanning and splitting: the ability to specify any AbstractFile, wherever it's located, when a new "disk" is required.

Before I show you how I skipped the unpleasant task of copying each part on separate floppies, let's first take a look at how you can support traditional spanning using Xceed Zip for .NET. The code below is what I call a minimum spanning implementation:

private static void Zip( AbstractFile zipFile, AbstractFolder sourceFolder )
{
  // Prepare ZipEvents object that will be passed to every method call.
  ZipEvents events = new ZipEvents();
  events.DiskRequired += new DiskRequiredEventHandler( ZipEvents_DiskRequired );
 
  // Create the target ZipArchive, and prepare for batch modifications.
  ZipArchive zip = new ZipArchive( events, null, zipFile );
  zip.BeginUpdate( events, null );
 
  try
  {