Thursday, November 24, 2005

I must admit, I'm no database wizz. It's been a looooong time since I've played with SQL, and I never really digged into System.Data. The rare moments I required a data connection, the design-time experience was enough, oh, and that Fill call on that adapter! q;-)

Recently, a customer explained he was using an SqlDataReader to fetch its data on demand, to avoid loading too much. One of the fields was a byte array containing a GZip file. He was using the static method GZipCompressedStream.Decompress to get the data in that GZip file. Unfortunately, the data was sometimes quite large, and this technique prevented him from using SqlDataReader.GetBytes (exposed in the IDataRecord interface), which allows to read only chunks of a field at a time.

Bummer... My "Streams everywhere" modo was challenged. You see, Xceed Streaming Compression for .NET allows you to either decompress a single byte array in one operation (stateless), or wrap a GZipCompressedStream around your source stream and read from it to decompress data on the fly. But in this case, no streams. Nada. Or if there is one, I didn't find it.

I was not going to get defeated by that mere absence. The "Streaming" in "Xceed Streaming Compression for .NET" is exactly about that scenario. It turns out it was quite easy to overcome this little problem. I created myself a DataRecordStream class, which derives from System.IO.Stream.

Apart from the usual overrides required when deriving from System.IO.Stream, I expose a constructor requiring an IDataRecord parameter and the index of the field to expose as a stream.

    public DataRecordStream( IDataRecord record, int fieldIndex )
{
if( record == null )
throw new ArgumentNullException( "record" );

if( ( fieldIndex < 0 ) || ( fieldIndex >= record.FieldCount ) )
throw new ArgumentOutOfRangeException( "fieldIndex", fieldIndex, "Invalid field index." );

m_record = record;
m_field = fieldIndex;
}

Then, when Read is called, I simply turn to my IDataRecord's GetBytes method to fill that buffer.

    public override int Read( byte[] buffer, int offset, int count )
{
long read = m_record.GetBytes( m_field, m_position, buffer, offset, count );

m_position += read;

if( ( m_length != -1 ) && ( m_position > m_length ) )
{
// The reported length was smaller than the actual size.
// We update the length dynamically.
m_length = m_position;
}

return unchecked( ( int )read );
}

The rest is just glue for managing the position and allowing seeking. The good part about this new class is that you can now wrap any pass-thru stream around it, for example a GZipCompressedStream. My customer can now read text compressed in a GZip file stored in one of its database fields quite easily, without consuming too much memory.

      SqlConnection connection = new SqlConnection( 
"integrated security=SSPI;data source=xxx;initial catalog=GZipTest" );

connection.Open();

try
{
using( SqlCommand command = new SqlCommand( "SELECT * FROM GZipTestTable", connection ) )
{
using( SqlDataReader dataReader = command.ExecuteReader() )
{
while( dataReader.Read() )
{
using( StreamReader textReader = new StreamReader(
new GZipCompressedStream(
new DataRecordStream(
dataReader, dataReader.GetOrdinal( "GZipField" ) ) ) ) )
{
string line;

while( ( line = textReader.ReadLine() ) != null )
{
Console.WriteLine( line );
}
}
}
}
}
}
finally
{
connection.Close();
}

I have made a VB.NET version of the class too. Enjoy!

DataRecordStream.zip (3.34 KB)



11/24/2005 9:25:04 AM (Eastern Standard Time, UTC-05:00)  #