|
 Wednesday, April 05, 2006
It's been a long time since I've posted something "chewable". My energies were all directed toward the soon-to-be-released Xceed Zip for .NET version 3.0. Though the apparent changes to the public interface are quite minor, and one can teach the new classes quite easily, the underlying code wasn't trivial.
In short, for all of you who know what ZipArchive, ZippedFile and ZippedFolder are, say hello to TarArchive, TarredFile, TarredFolder, GZipArchive and GZippedFile.
And when I say "easy to teach", what it really means is "find yourself a zipping example, and replace class name occurrences of Zip{something} with Tar{something} or GZip{something}".
Sure, there are some gotchas, like the fact that GZIP archives cannot contain filenames with subfolders, are not well suited to contain more than one compressed file, and can contain files without filenames. But these are details you'll get used to quite easily.
There are two things that make me really proud in that product. One is under the hood, and the other is a sample. First, the engine: My colleague Jacques and I have come up with what we call the "Storage Engine". It's an abstraction of what an archiving library needs in terms of temp storage, in-place archive updating, and transactional operations on an archive. Both the new TAR and GZIP implementations use it. In short, it abstracts the fact that we want to always update an archive in-place when possible, but revert to temp files and make sure to commit those temp files with any existing archive upon the last modification of it. If things go well, the ZIP implementation will benefit from it sooner rather than later.
Second, the sample: The FTP Sample Explorer is gone, replaced with the FileSystem Snippet Explorer, a sample that lets you see, modify and run code snippets that show you the various tasks one might wish to implement. It goes straight to the point. No bells and whistles, no gravy, just the meat. The code is embedded in the executable as compressed serialized XML data. The main information (each topic's description and code) is nothing more than rich text. The nice thing about this sample is that in order for me to modify and add new topics, I simply need to compile the project with an extra define, and I'm now running the application in "admin" mode, enabling me to update the compressed XML file directly, for the next compilation to benefit from this update.
Though I've finished work on this 3.0 version, I already have both hands in the next two releases of Xceed Zip for .NET and Xceed FTP for .NET. The first one will add support for AES encryption, and the second one will offer proxy support.
I didn't have much time to write because all those releases have a tight schedule I can't bust. I'm leaving Xceed in two months. Yup, I've decided it was time for me to move on. Until then, I have agreed to complete the AES implementation, help Jacques kick-start proxy support and train just about everybody here, each inheriting one of the numerous hats I'm wearing. It was a very difficult decision, since I have only friends at Xceed. Though the nine years or so I've spent here were exciting and challenging, I feel it's time for me to try new stuff... by myself. This isn't a divorce. I won't be far from Xceed, and I'll still be available to help them from time to time. As for you, dear customers and readers, rest assured you will stay in good hands. The team behind Xceed Zip and Xceed FTP, both .NET and ActiveX, will remain strong, and even get stronger than it is now.
|
|
 Tuesday, January 31, 2006
Ever since I started working with the .NET Framework, most of my time has been spent in the System.IO namespace. I'm not a UI guy, I'm an IO guy! The most important class in that namespace is System.IO.Stream. And since it was well-designed, probably inspired by other successful stream implementations (Delphi comes to mind), it's very easy to expose features using streams.
My favorite use of streams is for pass-through streams. A pass-through stream is a class which derives from System.IO.Stream, but reads from or writes to an inner stream received at creation. It serves as a data modifier or data analyzer. When reading from a pass-through stream, it first reads from its inner stream, then processes the data read (potentially modifying it) and returns this data. When writing to a pass-through stream, it first processes the provided data (again potentially modifying it), then writes it to its inner stream.
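To make the read and write paths concrete, here is a minimal sketch of a pass-through stream. The XorStream name and its single-byte "key" are purely illustrative assumptions; this is not one of the Xceed classes, just the smallest modifying stream I could think of.

```csharp
using System;
using System.IO;

// Hypothetical example class, not part of any Xceed library.
public class XorStream : Stream
{
    private readonly Stream m_inner;
    private readonly byte m_key;

    public XorStream( Stream inner, byte key )
    {
        if( inner == null )
            throw new ArgumentNullException( "inner" );

        m_inner = inner;
        m_key = key;
    }

    public override bool CanRead { get { return m_inner.CanRead; } }
    public override bool CanWrite { get { return m_inner.CanWrite; } }
    public override bool CanSeek { get { return false; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }

    public override int Read( byte[] buffer, int offset, int count )
    {
        // First read from the inner stream, then process what was read.
        int read = m_inner.Read( buffer, offset, count );

        for( int i = 0; i < read; i++ )
            buffer[ offset + i ] ^= m_key;

        return read;
    }

    public override void Write( byte[] buffer, int offset, int count )
    {
        // Process a copy first (the caller's buffer is left untouched),
        // then write the result to the inner stream.
        byte[] processed = new byte[ count ];

        for( int i = 0; i < count; i++ )
            processed[ i ] = ( byte )( buffer[ offset + i ] ^ m_key );

        m_inner.Write( processed, 0, count );
    }

    public override void Flush() { m_inner.Flush(); }
    public override long Seek( long offset, SeekOrigin origin ) { throw new NotSupportedException(); }
    public override void SetLength( long value ) { throw new NotSupportedException(); }

    protected override void Dispose( bool disposing )
    {
        // This simple version closes its inner stream along with itself.
        if( disposing )
            m_inner.Dispose();

        base.Dispose( disposing );
    }
}
```

Writing through one XorStream and reading back through another with the same key returns the original bytes, which is the whole point: the caller only ever sees a Stream.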
Xceed Zip for .NET and Xceed FTP for .NET both use a plethora of pass-through streams. The most popular is Xceed.Compression.CompressedStream, the stream responsible for compressing data before writing it to its inner stream, or decompressing data read from its inner stream. But most others are internal. We've been juggling with the idea of exposing them for a long time, but believe it would only confuse developers to "see" those new namespaces and classes. Another useful thing with internal classes is that we can change their interface without causing breaking changes.
TransientStream
It was a long debate before we decided to go forth with the "transient" keyword. Not only is it used in the TransientStream type name, but also as a property on many of our pass-through streams. What we meant by "transient" is "volatile", or if you prefer more explicit keywords, "does-not-close-its-inner-stream-when-closed". A TransientStream is about the simplest expression of a pass-through stream. All required property and method overrides simply call the inner stream. The only exception is for the Close method, which simply makes sure not to call Close on the inner stream. This is very useful when you need to pass your stream to another routine which closes the stream, while you don't want your stream to get closed.
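The idea fits in a few lines. The sketch below is my own stand-in (the NonClosingStream name is hypothetical, and this is not the actual TransientStream source): everything forwards to the inner stream except disposal.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for the idea behind TransientStream; not Xceed's code.
public class NonClosingStream : Stream
{
    private readonly Stream m_inner;

    public NonClosingStream( Stream inner )
    {
        if( inner == null )
            throw new ArgumentNullException( "inner" );

        m_inner = inner;
    }

    // Every member simply forwards to the inner stream...
    public override bool CanRead { get { return m_inner.CanRead; } }
    public override bool CanSeek { get { return m_inner.CanSeek; } }
    public override bool CanWrite { get { return m_inner.CanWrite; } }
    public override long Length { get { return m_inner.Length; } }
    public override long Position
    {
        get { return m_inner.Position; }
        set { m_inner.Position = value; }
    }

    public override void Flush() { m_inner.Flush(); }
    public override int Read( byte[] buffer, int offset, int count ) { return m_inner.Read( buffer, offset, count ); }
    public override long Seek( long offset, SeekOrigin origin ) { return m_inner.Seek( offset, origin ); }
    public override void SetLength( long value ) { m_inner.SetLength( value ); }
    public override void Write( byte[] buffer, int offset, int count ) { m_inner.Write( buffer, offset, count ); }

    // ...except closing: flush pending data, but leave the inner stream open.
    protected override void Dispose( bool disposing )
    {
        if( disposing && m_inner.CanWrite )
            m_inner.Flush();

        base.Dispose( disposing );
    }
}
```

A typical use is wrapping a stream before handing it to a StreamWriter or StreamReader, both of which close the stream they receive when they are disposed.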
ChecksumStream
This stream does not modify the data read from or written to it, but takes the opportunity to calculate either a CRC32 or an Adler32 checksum on that data. When reading, it can also verify, upon closing, that the calculated checksum matches an expected value, and throw an exception otherwise. This way, we can insert checksum calculation anywhere in a process without interfering with it or requiring code changes.
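As an illustration only, here is a read-only sketch that computes an Adler-32 checksum on whatever passes through. The class name and the explicit Verify method are assumptions of mine; the real ChecksumStream performs its check when closed, which this sketch does not attempt.

```csharp
using System;
using System.IO;

// Hypothetical sketch of a checksum-calculating pass-through stream.
public class Adler32ReadStream : Stream
{
    private const uint Modulus = 65521; // Largest prime below 2^16.

    private readonly Stream m_inner;
    private uint m_a = 1;
    private uint m_b = 0;

    public Adler32ReadStream( Stream inner )
    {
        if( inner == null )
            throw new ArgumentNullException( "inner" );

        m_inner = inner;
    }

    // Adler-32 of every byte read through this stream so far.
    public uint Checksum { get { return ( m_b << 16 ) | m_a; } }

    // Throws if the bytes read so far do not match the expected checksum.
    public void Verify( uint expected )
    {
        if( this.Checksum != expected )
            throw new InvalidDataException( "Checksum mismatch." );
    }

    public override int Read( byte[] buffer, int offset, int count )
    {
        int read = m_inner.Read( buffer, offset, count );

        // The data is returned untouched; we only look at it on its way out.
        for( int i = 0; i < read; i++ )
        {
            m_a = ( m_a + ( uint )buffer[ offset + i ] ) % Modulus;
            m_b = ( m_b + m_a ) % Modulus;
        }

        return read;
    }

    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return false; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }

    public override void Flush() { }
    public override long Seek( long offset, SeekOrigin origin ) { throw new NotSupportedException(); }
    public override void SetLength( long value ) { throw new NotSupportedException(); }
    public override void Write( byte[] buffer, int offset, int count ) { throw new NotSupportedException(); }
}
```

Because the consumer only sees a Stream, this wrapper can be slipped between any reader and its source without touching the reading code.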
CombinedStream
The deflate compression algorithm has the ability to detect the end of the data when decompressing. The CompressedStream is itself a pass-through stream. When reading from it, it first reads from the inner stream, then decompresses the data. When it reaches the end of the compressed data, the CompressedStream has the ability to return a stream on the remaining data, in case this inner stream contains more data after the compressed block. Why isn't this equivalent to the inner stream you might ask? Let's say the inner stream isn't seekable. The CompressedStream's Read method first reads N bytes from the inner stream, but may have found that the end of the compressed data is after M bytes (M < N). The inner stream is already N-M bytes too far. The CombinedStream receives both a byte array (the unused N-M bytes) and the inner stream as ctor parameters, and will expose those as one contiguous stream. Pretty slick!
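Here is roughly what such a stream looks like when reading. The PrefixedStream name is hypothetical and the sketch assumes a read-only, non-seekable result; it is not the actual CombinedStream code.

```csharp
using System;
using System.IO;

// Hypothetical sketch: exposes leftover bytes plus an inner stream as one stream.
public class PrefixedStream : Stream
{
    private readonly byte[] m_prefix;
    private readonly Stream m_inner;
    private int m_prefixPosition = 0;

    public PrefixedStream( byte[] prefix, Stream inner )
    {
        if( prefix == null )
            throw new ArgumentNullException( "prefix" );

        if( inner == null )
            throw new ArgumentNullException( "inner" );

        m_prefix = prefix;
        m_inner = inner;
    }

    public override int Read( byte[] buffer, int offset, int count )
    {
        int remaining = m_prefix.Length - m_prefixPosition;

        // Serve the already-consumed bytes first...
        if( remaining > 0 )
        {
            int copied = Math.Min( count, remaining );
            Buffer.BlockCopy( m_prefix, m_prefixPosition, buffer, offset, copied );
            m_prefixPosition += copied;
            return copied;
        }

        // ...then continue where the inner stream was left.
        return m_inner.Read( buffer, offset, count );
    }

    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return false; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }

    public override void Flush() { }
    public override long Seek( long offset, SeekOrigin origin ) { throw new NotSupportedException(); }
    public override void SetLength( long value ) { throw new NotSupportedException(); }
    public override void Write( byte[] buffer, int offset, int count ) { throw new NotSupportedException(); }
}
```

A single Read call may return fewer bytes than requested when it crosses the boundary between the byte array and the inner stream, which the Stream contract allows; callers loop until Read returns zero.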
HeaderFooterStream
Xceed Streaming Compression for .NET exposes stream-based (as opposed to archive-based) compression formats. Those formats all have one thing in common: they have a header and a footer. Not all of them can depend on the deflate algorithm to automatically detect the end of the stream. That's why they need to make sure to never return the first M bytes and last N bytes from the inner stream, where M is the expected header size and N the expected footer size.
WindowStream
When exposing part of a zip file as a single AbstractFile, we need to make sure we do not read past the boundaries of that file's data in the zip file. The WindowStream exposes a region of its inner stream as a zero-position, N-length stream.
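A minimal sketch of the concept, assuming a seekable inner stream and read-only access. The SubStream name and constructor parameters are mine, not the WindowStream's actual interface.

```csharp
using System;
using System.IO;

// Hypothetical sketch: a read-only window over a region of a seekable stream.
public class SubStream : Stream
{
    private readonly Stream m_inner;
    private readonly long m_start;
    private readonly long m_length;
    private long m_position = 0;

    public SubStream( Stream inner, long start, long length )
    {
        if( inner == null )
            throw new ArgumentNullException( "inner" );

        if( !inner.CanSeek )
            throw new ArgumentException( "The inner stream must be seekable.", "inner" );

        if( start < 0 || length < 0 )
            throw new ArgumentOutOfRangeException( "start" );

        m_inner = inner;
        m_start = start;
        m_length = length;
    }

    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return true; } }
    public override bool CanWrite { get { return false; } }
    public override long Length { get { return m_length; } }
    public override long Position
    {
        get { return m_position; }
        set { this.Seek( value, SeekOrigin.Begin ); }
    }

    public override int Read( byte[] buffer, int offset, int count )
    {
        long remaining = m_length - m_position;

        if( remaining <= 0 )
            return 0;

        // Never read past the end of the window.
        if( count > remaining )
            count = ( int )remaining;

        m_inner.Position = m_start + m_position;
        int read = m_inner.Read( buffer, offset, count );
        m_position += read;
        return read;
    }

    public override long Seek( long offset, SeekOrigin origin )
    {
        long target;

        switch( origin )
        {
            case SeekOrigin.Begin: target = offset; break;
            case SeekOrigin.Current: target = m_position + offset; break;
            default: target = m_length + offset; break;
        }

        if( target < 0 )
            throw new IOException( "Cannot seek before the start of the window." );

        m_position = target;
        return m_position;
    }

    public override void Flush() { }
    public override void SetLength( long value ) { throw new NotSupportedException(); }
    public override void Write( byte[] buffer, int offset, int count ) { throw new NotSupportedException(); }
}
```

Repositioning the inner stream on every Read keeps the window correct even when other code moves the inner stream between calls.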
ZCryptStream
This pass-through stream automatically encrypts or decrypts the data written or read, using the basic Zip encryption (which is as weak as me in front of a cheesecake). I will be working on AES encryption very soon, and it will most probably be implemented as a pass-through stream too!
NotifyStream
Though pass-through streams can do much of the task, it is often better for the clarity of the code to have processing done by other classes not deriving from System.IO.Stream. The NotifyStream class exposes three events: ReadingFromStream, WritingToStream and ClosingStream. Any other class can subscribe to those events to intervene in the reading or writing process. This old class has existed since the beginning of Xceed Zip for .NET, but it has proven very useful in the current development we are doing for Tar and GZip support within Xceed Zip for .NET.
ForwardSeekableStream
This new class created for Xceed Zip for .NET 3.0 (Tar and GZip support) can expose a non-seekable stream as a seekable stream when reading, or at least a stream reporting a Position when writing. When reading, you can call Seek with an offset beyond the current position, and it will simply read from the non-seekable inner stream until well positioned. And for both reading and writing operations, it counts the number of bytes read or written so it can report a position (granted we knew the original position when created).
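The reading half can be sketched as follows. The class name, the initialPosition parameter and the choice of exceptions are assumptions for illustration; the writing half (position counting only) is left out.

```csharp
using System;
using System.IO;

// Hypothetical sketch: forward-only seeking over a non-seekable stream.
public class ForwardSeekReadStream : Stream
{
    private readonly Stream m_inner;
    private long m_position;

    public ForwardSeekReadStream( Stream inner, long initialPosition )
    {
        if( inner == null )
            throw new ArgumentNullException( "inner" );

        m_inner = inner;
        m_position = initialPosition;
    }

    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return true; } } // Forward only!
    public override bool CanWrite { get { return false; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { return m_position; }
        set { this.Seek( value, SeekOrigin.Begin ); }
    }

    public override int Read( byte[] buffer, int offset, int count )
    {
        // Count every byte so a position can be reported.
        int read = m_inner.Read( buffer, offset, count );
        m_position += read;
        return read;
    }

    public override long Seek( long offset, SeekOrigin origin )
    {
        if( origin == SeekOrigin.End )
            throw new NotSupportedException( "The length is unknown." );

        long target = ( origin == SeekOrigin.Begin ) ? offset : m_position + offset;

        if( target < m_position )
            throw new NotSupportedException( "Only forward seeking is supported." );

        // "Seek" by reading and discarding bytes until well positioned.
        byte[] scratch = new byte[ 4096 ];

        while( m_position < target )
        {
            int wanted = ( int )Math.Min( scratch.Length, target - m_position );

            if( this.Read( scratch, 0, wanted ) == 0 )
                break; // The inner stream ended before the target.
        }

        return m_position;
    }

    public override void Flush() { }
    public override void SetLength( long value ) { throw new NotSupportedException(); }
    public override void Write( byte[] buffer, int offset, int count ) { throw new NotSupportedException(); }
}
```

This is enough for formats read sequentially with occasional skips, such as jumping over an entry's data to reach the next header.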
FtpAsciiDataStream
Xceed FTP for .NET also uses pass-through streams. For example, the FtpAsciiDataStream wraps the NetworkStream to perform conversion of LF to CR/LF on the fly when sending a file in ASCII mode.
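A write-only sketch of such a conversion follows. The class name is hypothetical, and leaving an existing CR/LF pair untouched is my own assumption about the desired behavior, not something taken from the FtpAsciiDataStream source.

```csharp
using System;
using System.IO;

// Hypothetical sketch: converts bare LF to CR/LF while writing.
public class LfToCrLfWriteStream : Stream
{
    private const byte CR = 13;
    private const byte LF = 10;

    private readonly Stream m_inner;
    private bool m_lastWasCR = false;

    public LfToCrLfWriteStream( Stream inner )
    {
        if( inner == null )
            throw new ArgumentNullException( "inner" );

        m_inner = inner;
    }

    public override void Write( byte[] buffer, int offset, int count )
    {
        // Worst case: every byte is a bare LF and doubles in size.
        byte[] converted = new byte[ count * 2 ];
        int length = 0;

        for( int i = 0; i < count; i++ )
        {
            byte current = buffer[ offset + i ];

            // Assumption: an LF already preceded by a CR is left as is,
            // even when the pair is split across two Write calls.
            if( current == LF && !m_lastWasCR )
                converted[ length++ ] = CR;

            converted[ length++ ] = current;
            m_lastWasCR = ( current == CR );
        }

        m_inner.Write( converted, 0, length );
    }

    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }

    public override void Flush() { m_inner.Flush(); }
    public override int Read( byte[] buffer, int offset, int count ) { throw new NotSupportedException(); }
    public override long Seek( long offset, SeekOrigin origin ) { throw new NotSupportedException(); }
    public override void SetLength( long value ) { throw new NotSupportedException(); }

    protected override void Dispose( bool disposing )
    {
        if( disposing )
            m_inner.Dispose();

        base.Dispose( disposing );
    }
}
```

The one piece of state to carry between calls is whether the previous byte was a CR, since a line ending can straddle two buffers.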
|
|
 Friday, November 11, 2005
I previously gave a glimpse of how to zip into an HttpResponse's OutputStream, but it didn't explain all aspects of zipping from ASP.NET. So I'll go into more detail here.
First, I have used my fantastic talent in UI designs to create this web page:

Yup, three checkboxes and a button is enough gadgets for me!
The first piece of code involves Application_Start. Since I know I won't be zipping gazillions of bytes, I want my web page to use memory as a temporary location for compressed data. How you do this with Xceed Zip for .NET is simple: You create a RAM drive! Oh the good old days of RAM drives...
protected void Application_Start(Object sender, EventArgs e)
{
    Xceed.Zip.Licenser.LicenseKey = "ZIN23-#####-#####-####";
    ZipArchive.DefaultTempFolder = new MemoryFolder();
}
This new MemoryFolder acts exactly like a per-process RAM drive. It's an AbstractFolder like any other AbstractFolder. The TempFolder of all new ZipArchive instances will be initialized to that value. Application_Start is also a great place to set your license key, before anything else.
We're now ready for the button's click event. Again, I want to avoid write access on the hard drive, and wish to zip directly in the response stream. But the idea behind the Xceed FileSystem is to copy source files and folders to destination files and folders. How can I zip into a Stream? The StreamFile class comes to the rescue. It lets you expose a Stream as if it were an AbstractFile. Then, you can pass this StreamFile to the ZipArchive's constructor, to tell that new instance to write into that Stream. The rest is glue code for my wonderful ASP.NET application to zip the correct files.
private void Button1_Click(object sender, System.EventArgs e)
{
    if( !CheckBox1.Checked && !CheckBox2.Checked && !CheckBox3.Checked )
    {
        // Redirect to error page...
        return;
    }
    // The "MACHINE\ASP_NET" user must have read access to that folder.
    DiskFolder source = new DiskFolder( @"d:\" );
    // We want the client-side to recognize the upcoming file as a zip file.
    this.Response.ContentType = "application/zip";
    this.Response.AddHeader( "Content-Disposition", "attachment; filename=YourFiles.zip" );
    // We will zip directly in the response stream. The temporary compressed
    // data will be written to the ZipArchive's TempFolder, thus the MemoryFolder
    // we set in Application_Start.
    ZipArchive destination = new ZipArchive( new StreamFile( this.Response.OutputStream ) );
    // And finally we zip in a single operation. If we had to zip more than
    // one source, we could have used ZipArchive.BeginUpdate/EndUpdate.
    ArrayList nameFilters = new ArrayList();
    if( CheckBox1.Checked )
        nameFilters.Add( new NameFilter( "*.txt" ) );
    if( CheckBox2.Checked )
        nameFilters.Add( new NameFilter( "*.jpg" ) );
    if( CheckBox3.Checked )
        nameFilters.Add( new NameFilter( "*.exe|*.dll" ) );
    // Passing more than one filter to CopyFilesTo does an "AndFilter"
    // by default.
    Filter mainFilter = ( nameFilters.Count == 1 )
        ? nameFilters[ 0 ] as Filter
        : new OrFilter( nameFilters.ToArray( typeof( NameFilter ) ) );
    source.CopyFilesTo( destination, false, true, mainFilter );
    this.Response.End();
}
We now have an ASP.NET application which only requires read access to the source files and folders to zip. Everything else is done in memory, without drifting away from the logic of the Xceed FileSystem: manipulating files and folders.
|
|
 Tuesday, November 01, 2005
Just in case my previous post on the subject did not ring a bell, the release of version 2.1 of Xceed FTP for .NET means you can directly unzip from a zip file located on an FTP server, without downloading the file first! Look at the following code:
using( FtpConnection connection = new FtpConnection( "ftp.xceed.com" ) )
{
    FtpFile source = new FtpFile( connection, @"/images/Flowers/Backup/Flowers.zip" );
    DiskFolder dest = new DiskFolder( @"d:\temp\flowers" );
    ZipArchive zip = new ZipArchive( source );
    zip.CopyFilesTo( dest, true, true );
}
The secret behind this code is the kind of stream "FtpFile.OpenRead" returns. Though we are dealing with a network connection, this stream is fully seekable! The FtpFile takes advantage of the "REST" FTP command, which tells the FTP server we wish to start the transfer at a specific offset. Thus, when the ZipArchive needs to seek to the end of the file to locate the ending header, a proper "REST" command is issued to avoid having to read the whole zip file first. And the same happens when reading the central directory, or unzipping specific files.
|
|
 Wednesday, October 05, 2005
I'm glad to announce that Xceed FTP for .NET 2.1 is now available for download.
I've been working on this release for the past few months, and I'm very
excited to finally see the FTP FileSystem come to life.
For those not familiar with the Xceed FileSystem, which comes with
Xceed Zip for .NET, here is some code that sheds some light on what you
can do with it. Consider these variables of the same base type:
// A file on disk
AbstractFile first = new DiskFile( @"d:\FileSystem.txt" );
// Another file on disk
AbstractFile second = new DiskFile( @"c:\temp\AnotherFileSystem.txt" );
// A file compressed in a zip file on disk
AbstractFile third = new ZippedFile( new DiskFile( @"c:\temp\data.zip" ), "FileSystemInAZip.txt" );
// A file in the isolated storage
AbstractFile fourth = new IsolatedFile( "Isolated.txt" );
// A file in memory (random name)
AbstractFile fifth = new MemoryFile();
// A file compressed in a zip file in memory
AbstractFile sixth = new ZippedFile( new MemoryFile(), "VolatileFileSystem.txt" );
You can copy files around very easily:
// Copying the first file anywhere else is always the same!
first.CopyTo( second, true );
first.CopyTo( third, true );
first.CopyTo( fourth, true );
first.CopyTo( fifth, true );
first.CopyTo( sixth, true );
And accessing the contents of any file is always the same:
private void DisplayTextFile( AbstractFile file )
{
    Console.WriteLine( "Displaying the contents of {0}, which is a {1}.", file.FullName, file.GetType().Name );
    using( StreamReader reader = new StreamReader( file.OpenRead() ) )
    {
        string line;
        while( ( line = reader.ReadLine() ) != null )
        {
            Console.WriteLine( line );
        }
    }
    Console.WriteLine();
}
// Displaying the contents of those files is always the same!
DisplayTextFile( first );
DisplayTextFile( second );
DisplayTextFile( third );
DisplayTextFile( fourth );
DisplayTextFile( fifth );
DisplayTextFile( sixth );
And why not finish this demonstration by deleting the files we just created?
// And finally, deleting files is the same!
second.Delete();
third.Delete();
fourth.Delete();
fifth.Delete();
sixth.Delete();
Any kind of file is an AbstractFile. Any kind of folder is an
AbstractFolder. This way, a DiskFile, an IsolatedFile, a ZippedFile and
a MemoryFile share a common set of properties and methods for accessing
their metadata and reading/writing their actual data. And a DiskFolder,
an IsolatedFolder, a ZippedFolder and a MemoryFolder share a common set
of methods for discovering child items.
How does the FTP FileSystem come into play? Simply by offering the
same abstraction over files and folders stored on an FTP server. We
could simply add this code to the above sample, and everything works
just as expected!
AbstractFile seventh = new FtpFile( new FtpConnection( "localhost" ), @"\RemoteFileSystem.txt" );
first.CopyTo( seventh, true );
DisplayTextFile( seventh );
seventh.Delete();
Let's dive a little bit into this implementation of an AbstractFile
and AbstractFolder. Each constructor requires an FtpConnection
instance, which contains information on how to connect to the target
FTP server. Though it looks like a simple information storage class, it
does much more. Each time an FtpFile or an FtpFolder requires
information, or an incoming or outgoing stream on a file's data, it
asks the FtpConnection for an active command channel connection to the
server. This way, a unique command channel is generally required for
accessing many files on the server.
using( FtpConnection connection = new FtpConnection( "ftp.xceed.com" ) )
{
    connection.TraceWriter = Console.Out;
    FtpFolder root = new FtpFolder( connection );
    foreach( AbstractFile file in root.GetFiles( false, "*.txt" ) )
    {
        DisplayTextFile( file );
    }
}
If we comment out the "Console.WriteLine( line );" line in
"DisplayTextFile", we can see the FTP conversation that occurred for
the above code:
Connected to 66.46.177.250:21 on 9/27/2005 @ 2:24:54 PM
< 220 Serv-U FTP Server v6.0 for WinSock ready...
> USER anonymous
< 331 User name okay, please send complete E-mail address as password.
> PASS *****
< 230 User logged in, proceed.
> PWD
< 257 "/" is current directory.
> CWD /
< 250 Directory changed to /
> CWD /
< 250 Directory changed to /
> CWD /
< 250 Directory changed to /
> TYPE A
< 200 Type set to A.
> PASV
< 227 Entering Passive Mode (66,46,177,250,6,238)
> LIST
Data connection established with 66.46.177.250:1774 on 9/27/2005 @ 2:24:56 PM
< 150 Opening ASCII mode data connection for /bin/ls.
< 226-Maximum disk quota limited to Unlimited kBytes
< Used disk quota 0 kBytes, available Unlimited kBytes
< 226 Transfer complete.
Displaying the contents of \FileSystem.txt, which is a FtpFile.
> CWD /
< 250 Directory changed to /
> TYPE I
< 200 Type set to I.
> PASV
< 227 Entering Passive Mode (66,46,177,250,6,240)
> RETR FileSystem.txt
Data connection established with 66.46.177.250:1776 on 9/27/2005 @ 2:24:57 PM
< 150 Opening BINARY mode data connection for FileSystem.txt (1198 Bytes).
< 226-Maximum disk quota limited to Unlimited kBytes
< Used disk quota 0 kBytes, available Unlimited kBytes
< 226 Transfer complete.

Displaying the contents of \appnote.txt, which is a FtpFile.
> CWD /
< 250 Directory changed to /
> TYPE I
< 200 Type set to I.
> PASV
< 227 Entering Passive Mode (66,46,177,250,6,246)
> RETR appnote.txt
Data connection established with 66.46.177.250:1782 on 9/27/2005 @ 2:24:58 PM
< 150 Opening BINARY mode data connection for appnote.txt (109785 Bytes).
< 226-Maximum disk quota limited to Unlimited kBytes
< Used disk quota 0 kBytes, available Unlimited kBytes
< 226 Transfer complete.

> QUIT
Disconnected from 66.46.177.250:21 on 9/27/2005 @ 2:25:04 PM
Each FtpFolder and FtpFile instance shared the same FtpConnection,
and since no two operations were done at the same time, a single
connection was required, as the log indicates. The FtpConnection object
implements the IDisposable interface, since it keeps any active
connection available until disposed (or finalized).
Now what happens if I try to open two files at the same time, like this?
using( FtpConnection connection = new FtpConnection( "ftp.xceed.com" ) )
{
    connection.TraceWriter = Console.Out;
    AbstractFile first = new FtpFile( connection, @"\FileSystem.txt" );
    AbstractFile second = new FtpFile( connection, @"\appnote.txt" );
    using( Stream firstStream = first.OpenRead() )
    {
        using( Stream secondStream = second.OpenRead() )
        {
            // In an FTP conversation with an FTP server, only one command
            // at a time can be pending. Here, we clearly have two files
            // open at the same time on the same FTP server. How? Each file
            // has its own connection to the FTP server!
        }
    }
}
The FtpConnection object will create extra command channel
connections as required. The output shows two command channel
connections were made:
Connected to 66.46.177.250:21 on 9/27/2005 @ 2:38:06 PM
< 220 Serv-U FTP Server v6.0 for WinSock ready...
> USER anonymous
< 331 User name okay, please send complete E-mail address as password.
> PASS *****
< 230 User logged in, proceed.
> CWD /
< 250 Directory changed to /
> TYPE A
< 200 Type set to A.
> PASV
< 227 Entering Passive Mode (66,46,177,250,10,53)
> LIST
Data connection established with 66.46.177.250:2613 on 9/27/2005 @ 2:38:07 PM
< 150 Opening ASCII mode data connection for /bin/ls.
< 226-Maximum disk quota limited to Unlimited kBytes
< Used disk quota 0 kBytes, available Unlimited kBytes
< 226 Transfer complete.
> CWD /
< 250 Directory changed to /
> TYPE I
< 200 Type set to I.
> PASV
< 227 Entering Passive Mode (66,46,177,250,10,54)
> RETR FileSystem.txt
Data connection established with 66.46.177.250:2614 on 9/27/2005 @ 2:38:08 PM
< 150 Opening BINARY mode data connection for FileSystem.txt (1198 Bytes).
Connected to 66.46.177.250:21 on 9/27/2005 @ 2:38:08 PM
< 220 Serv-U FTP Server v6.0 for WinSock ready...
> USER anonymous
< 226-Maximum disk quota limited to Unlimited kBytes
< Used disk quota 0 kBytes, available Unlimited kBytes
< 226 Transfer complete.
< 331 User name okay, please send complete E-mail address as password.
> PASS *****
< 230 User logged in, proceed.
> CWD /
< 250 Directory changed to /
> TYPE I
< 200 Type set to I.
> PASV
< 227 Entering Passive Mode (66,46,177,250,10,55)
> RETR appnote.txt
Data connection established with 66.46.177.250:2615 on 9/27/2005 @ 2:38:09 PM
< 150 Opening BINARY mode data connection for appnote.txt (109785 Bytes).
< 426-Maximum disk quota limited to Unlimited kBytes
< Used disk quota 0 kBytes, available Unlimited kBytes
< 426 Data connection closed, file transfer appnote.txt aborted.
> QUIT
Disconnected from 66.46.177.250:21 on 9/27/2005 @ 2:38:13 PM
> QUIT
Disconnected from 66.46.177.250:21 on 9/27/2005 @ 2:38:13 PM
And the great part about all this is that you don't have to worry
about this while coding. You're just manipulating yet another kind of
AbstractFile or AbstractFolder.
If we get back to the Zip implementation of the Xceed FileSystem,
you can see that a ZippedFile or ZippedFolder (or ZipArchive, the root
ZippedFolder) constructor needs to know which AbstractFile is holding
the actual zip file that should contain this file or folder.
"AbstractFile" truly means "any file", as long as there is an
AbstractFile-derived class somewhere to expose this file. It means that
zipping directly onto an FTP server is no more difficult than zipping
in a regular file on disk.
AbstractFolder source = new DiskFolder( @"d:\Data" );
AbstractFolder localDest = new ZipArchive( new DiskFile( @"d:\temp\local.zip" ) );
AbstractFolder remoteDest = new ZipArchive( new FtpFile( new FtpConnection( "localhost" ), @"remote.zip" ) );
// Copying is the same, no matter what is the destination
// file or folder.
source.CopyTo( localDest, true );
source.CopyTo( remoteDest, true );
Code for zipping in "D:\temp\local.zip" is no different than code
for zipping in "ftp://localhost/remote.zip". And obviously, reading or
unzipping from any zip file is the same.
AbstractFolder localSource = new ZipArchive( new DiskFile( @"d:\temp\local.zip" ) );
AbstractFolder remoteSource = new ZipArchive( new FtpFile( new FtpConnection( "localhost" ), @"remote.zip" ) );
AbstractFolder dest = new DiskFolder( @"d:\restored" );
// Unzipping text files from any source is the same!
localSource.CopyFilesTo( dest, true, true, "*.txt" );
remoteSource.CopyFilesTo( dest, true, true, "*.txt" );
I really hope this new addition to the Xceed FileSystem will
generate the same enthusiasm we had inventing and developing it. I'm
very interested in hearing your opinions!
|
|
 Thursday, June 09, 2005
In the new package v1.2.5309, which will be available for download next week, resides a new feature that won't get much emphasis, but which I was very eager to complete. You can now create a ZipArchive instance around an AbstractFile that does not support reading.
(drum roll) ... (looking around) ... Nobody's applauding? That's because you probably don't know yet how useful this can be.
Most ASP.NET applications that wish to create zip files on the fly and send them in the response are either stuck with creating those zip files on disk in a temporary filename, or create them in a MemoryFile, then copy that MemoryFile in the response stream.
However, the StreamFile class was created for such purposes of exposing any existing Stream as an AbstractFile. You already could create a StreamFile around the Response's OutputStream. But passing that StreamFile to the ZipArchive's constructor would fail, because it can't read from it. Instead of assuming an empty zip file, it miserably failed. Shame.
No more... Since version 2.2.5302, it will assume the zip file is empty. So code like this works perfectly:
public void ProcessRequest(HttpContext context)
{
    context.Response.ContentType = "application/zip";
    // The downloaded file is a zip archive, so give it a .zip name.
    context.Response.AddHeader( "Content-Disposition", "attachment; filename=images.zip" );
    ZipArchive archive = new ZipArchive( new StreamFile( context.Response.OutputStream ) );
    DiskFolder source = new DiskFolder( context.Request.MapPath( "." ) );
    source.CopyFilesTo( archive, false, false, "*.bmp" );
}
The same problem appeared when trying to combine Xceed Zip for .NET with Xceed FTP for .NET, to upload zip files directly on the FTP server. Though the FtpClient class exposes a very useful GetUploadStream method to get a direct stream on the data connection, code like this previously failed.
using( Stream upload = client.GetUploadStream( "images.zip" ) )
{
    ZipArchive archive = new ZipArchive( new StreamFile( upload ) );
    DiskFolder source = new DiskFolder( @"d:\images\" );
    source.CopyFilesTo( archive, false, false, "*.bmp" );
}
Talk about short and sweet uploads of zip files!
|
|
 Thursday, May 19, 2005
Lately, people have been asking us how to abort a zipping operation with Xceed Zip for .NET. The official answer is "you can't", as there is no method or property exposed for this task, as opposed to Xceed Zip ActiveX with its simple Abort property. But the truth is you can, with relatively little coding.
Before we get into how to abort, let's talk a little bit about the ZipArchive's TempFolder property. By default, it points to the same folder as the static ZipArchive.DefaultTempFolder property, which itself points to the user's temp folder, as exposed by System.IO.Path.GetTempPath().
Though the library is designed to delete any file it creates in the temporary folder, if an operation fails midway, this cleanup can occur only when the instances get finalized.
A good coding pattern I like to use is the following:
ZipArchive zip = new ZipArchive( new DiskFile( @"d:\temp\backup.zip" ) );
zip.TempFolder = zip.TempFolder.CreateFolder( Guid.NewGuid().ToString() );
try
{
    using( AutoBatchUpdate auto = new AutoBatchUpdate( zip ) )
    {
        DiskFolder source = new DiskFolder( @"d:\Data" );
        source.CopyTo( zip, true );
    }
}
finally
{
    zip.TempFolder.Delete();
}
This makes sure no temp file survives a zipping cycle. And with that pattern, I can set the "default" temporary location once using the static DefaultTempFolder property, and each instance will use a unique folder within this starting point.
Now that my zipping operations are cleaning their traces, we're ready to talk about aborting. Some key concepts:
- The library isn't pumping messages, and does not offer async operations. If you want your WinForms application's "Abort" button to react, you will have to pump messages yourself somewhere.
- There are three major operations behind the creation or modification of a zip file:
  - Compressing each new file.
  - Moving each file to keep from the original zip file (if updating an existing zip file).
  - Building the target zip file by appending data created by the above two steps.
- Zip and FileSystem events get raised at many levels, so you should pass your ZipEvents instance everywhere an overload accepting a FileSystemEvents or ZipEvents instance exists.
Your "Abort" button (or any abort input you like) will simply raise a flag. It can't do more.
private bool m_abort = false;
private void AbortButton_Click(object sender, System.EventArgs e)
{
    m_abort = true;
}
Then you handle three events matching the aforementioned three steps, pump messages to keep a responsive application, and check if the flag is raised. You can safely use the same method for handling the three events.
private void CheckAbort_ByteProgression(object sender, ByteProgressionEventArgs e)
{
    if( m_abort )
        throw new ApplicationException( "The user aborted the operation." );
    Application.DoEvents();
}
As you can see, if the flag is raised, I'm throwing an ApplicationException. This will result in a System.Reflection.TargetInvocationException being thrown by the originally called method. To get a well-behaved application, you obviously want to trap any exception the FileSystem could throw. You can catch any TargetInvocationException to display an "operation aborted" message. Here's my code for the full operation:
private void StartButton_Click(object sender, System.EventArgs e)
{
    m_abort = false;
    StartButton.Enabled = false;
    AbortButton.Enabled = true;
    try
    {
        ZipEvents events = new ZipEvents();
        // Advise for the three main events for checking abort flag.
        events.ByteProgression += new ByteProgressionEventHandler( CheckAbort_ByteProgression );
        events.GatheringZipContentByteProgression += new GatheringZipContentByteProgressionEventHandler( CheckAbort_ByteProgression );
        events.BuildingZipByteProgression += new BuildingZipByteProgressionEventHandler( CheckAbort_ByteProgression );
        // What's cool with delegates is that you can separate logic from UI.
        events.ByteProgression += new ByteProgressionEventHandler( UpdateUI_ByteProgression );
        ZipArchive zip = new ZipArchive( events, null, new DiskFile( @"d:\temp\backup.zip" ) );
        zip.TempFolder = zip.TempFolder.CreateFolder( Guid.NewGuid().ToString() );
        try
        {
            using( AutoBatchUpdate auto = new AutoBatchUpdate( zip, events, null ) )
            {
                DiskFolder source = new DiskFolder( @"d:\Data" );
                source.CopyTo( events, null, zip, true );
            }
        }
        finally
        {
            zip.TempFolder.Delete();
            // Clean up events.
            events.ByteProgression -= new ByteProgressionEventHandler( CheckAbort_ByteProgression );
            events.GatheringZipContentByteProgression -= new GatheringZipContentByteProgressionEventHandler( CheckAbort_ByteProgression );
            events.BuildingZipByteProgression -= new BuildingZipByteProgressionEventHandler( CheckAbort_ByteProgression );
            events.ByteProgression -= new ByteProgressionEventHandler( UpdateUI_ByteProgression );
        }
    }
    catch( System.Reflection.TargetInvocationException except )
    {
        MessageBox.Show( except.InnerException.Message, "Abort" );
    }
    catch( Exception except )
    {
        MessageBox.Show( except.Message, "Error" );
    }
    finally
    {
        AbortButton.Enabled = false;
        StartButton.Enabled = true;
        m_abort = false;
    }
}
Things to notice:
- I'm passing my "events" instance to:
  - The ZipArchive's constructor, so you could also handle the ReadingZipItemProgression events.
  - The AutoBatchUpdate constructor, which in turn passes it to both BeginUpdate and EndUpdate. The latter method generates the GatheringZipContentByteProgression and BuildingZipByteProgression events.
  - The CopyTo method, which generates the ByteProgression events.
- I'm subscribing twice to the ByteProgression event: once to handle abort conditions, and once to update my UI. This is a cool way to leverage delegates and separate the logic from the UI.
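To make the "two handlers on one event" point concrete without depending on the library, here is a minimal, self-contained sketch. ProgressSource and DelegateDemo are hypothetical stand-ins, not Xceed types, and the sketch assumes the abort handler signals cancellation by throwing an exception.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a progress-reporting operation; not an Xceed type.
class ProgressSource
{
    public event Action<int> Progress;

    public void Run(int steps)
    {
        for (int i = 1; i <= steps; i++)
        {
            // A multicast delegate calls every subscriber in subscription order.
            if (Progress != null)
                Progress(i * 100 / steps);
        }
    }
}

static class DelegateDemo
{
    // Runs the operation and returns the percentages the "UI" handler saw.
    // abortAt simulates the user pressing Abort once that percentage is reached.
    public static List<int> Run(int steps, int abortAt)
    {
        List<int> shown = new List<int>();
        ProgressSource source = new ProgressSource();

        // Logic handler: knows nothing about the UI.
        // Assumption: an abort is signaled by throwing.
        Action<int> checkAbort = delegate(int percent)
        {
            if (percent >= abortAt)
                throw new OperationCanceledException("Aborted by user");
        };

        // UI handler: knows nothing about abort logic.
        Action<int> updateUi = delegate(int percent) { shown.Add(percent); };

        source.Progress += checkAbort;
        source.Progress += updateUi;
        try
        {
            source.Run(steps);
        }
        catch (OperationCanceledException)
        {
            // The operation stopped early; nothing else to do in this sketch.
        }
        finally
        {
            // Unsubscribe both handlers, mirroring the cleanup in the sample.
            source.Progress -= checkAbort;
            source.Progress -= updateUi;
        }
        return shown;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", Run(4, 75))); // prints "25, 50"
    }
}
```

Because the abort check is subscribed first, it runs before the UI update on every notification, so the UI handler never sees the step at which the abort fires.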
 Tuesday, April 19, 2005
I just got bitten by the .NET Framework's COM interop. We had a problem with Xceed Zip ActiveX used in a .NET application. If the application handled the ZipPreprocessingFile event and changed the sFilename parameter (BSTR*, or ByRef String if you wish), sometimes the library did not change the filename in the resulting zip file.
That "sometimes" was the mysterious part, though I had a good idea where the problem was. The method which fires the ZipPreprocessingFile event makes a dangerous but, up until now, valid assumption. The kind of assumption that would make Raymond Chen or Don Box real mad. It took for granted that the BSTR address would change if the callee changed the BSTR. I made this assumption based on two facts:
- A BSTR is an immutable entity. If you need to modify one, you should create a copy with the new content.
- If the implementation of a function that takes a BSTR reference parameter assigns a new BSTR to the parameter, it must free the previously referenced BSTR. (written "as is" in MSDN)
The .NET code that reproduces the problem does a very simple thing:
sFilename = sFilename & "new"
Normally, languages work with the provided BSTR* as is. If a modification occurs, they allocate the required new BSTR, copy characters from the old BSTR, then free the old one. Since the old BSTR is still allocated when the new one is created, the new string cannot have the same address as the old one.
In .NET, COM interop actually copies the BSTR into a System.String, works with that System.String throughout the function, then checks whether the string changed before returning control to the COM caller. If it did, it either calls SysReAllocString on the old BSTR, using the String as the "psz" parameter, or simply frees the old BSTR and then allocates a new one based on the String.
Bam! Turns out SysReAllocString or SysAllocString sometimes places the new BSTR at the same address as the old one. Can't argue with that. My bad.
Three things to conclude from that:
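The free-then-allocate sequence can be sketched from managed code with the Marshal BSTR helpers. This is only an illustration: BstrDemo and ChangedByContent are hypothetical names, whether the two addresses coincide depends on the allocator and will vary from run to run, and comparing contents is just one assumed way to detect a change without relying on the pointer (not necessarily how the library was fixed).

```csharp
using System;
using System.Runtime.InteropServices;

static class BstrDemo
{
    // Detects a change by comparing string contents, not addresses.
    public static bool ChangedByContent(string before, IntPtr bstrAfter)
    {
        string after = Marshal.PtrToStringBSTR(bstrAfter);
        return !string.Equals(before, after, StringComparison.Ordinal);
    }

    static void Main()
    {
        IntPtr oldBstr = Marshal.StringToBSTR("file.txt");
        string snapshot = Marshal.PtrToStringBSTR(oldBstr);
        IntPtr oldAddress = oldBstr;

        // Simulate a callee that frees the old BSTR first and only then
        // allocates the replacement, which allows the address to be reused.
        Marshal.FreeBSTR(oldBstr);
        IntPtr newBstr = Marshal.StringToBSTR("file.new");

        // May print True or False; the pointer comparison proves nothing.
        Console.WriteLine("Same address: " + (newBstr == oldAddress));

        // The content comparison reports the change either way.
        Console.WriteLine("Changed by content: " + ChangedByContent(snapshot, newBstr));

        Marshal.FreeBSTR(newBstr);
    }
}
```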
- You can never use what you observe in experiments as proof. Experiments and tests only ever cover a subset of the big picture.
- Don't make assumptions larger than the initial statement. Assuming that a pointer would always change just because the old BSTR behind a BSTR* parameter must be freed when a new one is assigned was stretching the actual fact.
- Optimizations may sound good, but they can always introduce more problems. Simplicity is bliss.
By the way, I realized I could try a very simple VB6 test. If any new BSTR can end up at the same address once the old BSTR is freed, does that mean a VB6 application modifying the sFilename parameter twice can reproduce the same bug? Absolutely! My VB6 sample application did this in ZipPreprocessingFile:
sFilename = sFilename & ".foo"
sFilename = sFilename & ".bar"
Turns out files are sometimes renamed, sometimes not... Sometimes, I feel like an...
 Tuesday, March 08, 2005
Warning: Do not try this at home!
A few days ago, Pierre-Luc at support asked me if Xceed Zip for .NET was thread safe. I knew from his look that he was expecting a "yes" or a "no". At least, that's what the client who asked him the same question expected.
My first answer was more nuanced: though the library was made to be safely accessible from multiple threads at the same time, the sequential nature of the zip file format makes it impossible to work on the same zip file from multiple threads.
He nodded in approval, confirming that his client wasn't attempting anything that crazy, but simply dealing with a multi-threaded application where each thread may be zipping into its own private file. I gave him my blessing: in that case, yes, Xceed Zip for .NET is thread safe.
Pierre-Luc wasn't two feet away when an idea struck me. It wouldn't be that crazy to try zipping into the same zip file from multiple threads. How neat would it be to benefit from multiprocessor or hyperthreading machines when building a single zip file? Guess what... you can! You shouldn't... but you can! Don't ask us to support this scenario... but you can!
Here's the deal. Any ZipArchive you modify gets updated when the last modify operation occurs. If you know you're about to make more than one modification to a single zip file, you should first call BeginUpdate, do all modifications, and finally call EndUpdate. The zip file will only get rebuilt on that final call. The files you copy into the zip file before EndUpdate will be compressed and stored in temp files within the ZipArchive's TempFolder.
That means every copy operation you perform within a BeginUpdate/EndUpdate block is atomic, and only involves compressing the source into independent temp files. You see where I'm heading? How about spawning threads within that block, each thread copying its own source, and waiting for all threads to finish before calling EndUpdate?
I had to try it. I started with a class implementing IAsyncResult, which would be managing the copy operation on a separate thread:
internal class AsyncCopyResult : IAsyncResult
{
    public AsyncCopyResult(
        AbstractFolder source,
        AbstractFolder dest,
        AsyncCallback callback,
        object state )
    {
        m_source = source;
        m_dest = dest;
        m_callback = callback;
        m_state = state;
        m_thread = new Thread( new ThreadStart( this.ThreadProc ) );
        m_completed = new ManualResetEvent( false );
    }

    // Starts the copy operation on its own thread.
    public void Begin()
    {
        m_completed.Reset();
        m_thread.Start();
    }

    // Blocks until the copy is done, then rethrows any exception it caught.
    public void End()
    {
        // We must not join thread since we may get called by callback, itself
        // within thread.
        m_completed.WaitOne();

        if( m_result != null )
            throw m_result;
    }

    #region IAsyncResult IMPLEMENTATION

    public object AsyncState
    {
        get { return m_state; }
    }

    public bool CompletedSynchronously
    {
        get { return false; }
    }

    public WaitHandle AsyncWaitHandle
    {
        get { return m_completed; }
    }

    public bool IsCompleted
    {
        // A zero timeout polls the event without blocking the caller.
        get { return m_completed.WaitOne( 0, false ); }
    }

    #endregion

    private void ThreadProc()
    {
        try
        {
            m_result = null;

            if( m_source == null )
                throw new ArgumentNullException( "source" );

            if( m_dest == null )
                throw new ArgumentNullException( "dest" );

            if( m_source.IsRoot )
            {
                m_source.CopyFilesTo( m_dest, true, true );
            }
            else
            {
                m_source.CopyTo( m_dest, true );
            }
        }
        catch( Exception except )
        {
            // Keep the exception so End() can rethrow it on the caller's thread.
            m_result = except;
        }

        m_completed.Set();
if( m_callback != | |