Library code snippets

String compression

This example shows you how to compress a string. The MaxPatternLen parameter of the Compress function basically controls the compression quality(the larger the value the better quality). The amount of time this adds far out ways how much more the string will be compressed. With that in mind you should usually leave the value at 5.

Private Type Pattern
    Text As String
    TimesRepeated As Integer
    Position As Long
End Type

Private Function Compress(Text As String, Optional ByVal MaxPatternLen As Byte = 5) Dim Patterns() As Pattern Dim PatternLen As Long Dim Char As String Dim Compressed As Integer Dim ShortestPattern As Byte

    If MaxPatternLen > Len(Text) Then MaxPatternLen = Len(Text) 'this can save alot of time
    ShortestPattern = 4 + Len(Str(Len(Text)))
    If ShortestPattern > MaxPatternLen Then ShortestPattern = MaxPatternLen

    ReDim Patterns(1 To 1)
    Do Until Text = ""
PatternLoop:
        If Text = "" Then Exit Do 'Sometimes control is directed here when it shouldn't be.
        For CurPatternLen = 1 To MaxPatternLen
            If MaxPatternLen > Len(Text) Then MaxPatternLen = Len(Text) 'this can save alot of time
            Char = Left(Text, CurPatternLen)
            If Left(Text, CurPatternLen * 2) = Char & Char Then
                PatternLen = CurPatternLen
                Do Until Right(Left(Text, PatternLen + CurPatternLen), CurPatternLen) <> Char Or PatternLen = Len(Text)
                    PatternLen = PatternLen + CurPatternLen
                Loop
                   
                If PatternLen > ShortestPattern And PatternLen > 6 Then
                    ReDim Preserve Patterns(1 To UBound(Patterns) + 1)
                    Patterns(UBound(Patterns)).Text = Char
                    Patterns(UBound(Patterns)).TimesRepeated = PatternLen / CurPatternLen
                    Patterns(UBound(Patterns)).Position = Len(Compress)
                   
                    Text = Right(Text, Len(Text) - PatternLen)
                Else
                    Compress = Compress & Left(Text, PatternLen)
                    Text = Right(Text, Len(Text) - PatternLen)
                End If
                GoTo PatternLoop
            End If
        Next
        Compress = Compress & Left(Text, 1)
        Text = Right(Text, Len(Text) - 1)
    Loop
   
    For x = 1 To UBound(Patterns)
        If Patterns(x).Text <> "" Then
            Compress = Patterns(x).Text & Compress
            Compress = Patterns(x).Position & " " & Compress
            Compress = Patterns(x).TimesRepeated & " " & Compress
            Compress = Len(Patterns(x).Text & Str(Patterns(x).TimesRepeated & Patterns(x).Position)) + 2 & " " & Compress
           
            Compressed = Compressed + 1
        End If
    Next
    Compress = Compressed & " " & Compress
       
End Function
Private Function Decompress(Text As String)
Dim Patterns() As Pattern
Dim Xstr As String
    If Left(Text, InStr(Text, " ") - 1) = 0 Then
        Decompress = Right(Text, Len(Text) - 2)
        Exit Function
    End If
    ReDim Patterns(1 To Left(Text, InStr(Text, " ") - 1))
    Text = Right(Text, Len(Text) - InStr(Text, " "))
   
    For x = 1 To UBound(Patterns)
        Xstr = Left(Text, InStr(Text, " ") - 1)
        Text = Right(Text, Len(Text) - Len(Xstr))
        Xstr = Left(Text, Xstr)
        Text = Right(Text, Len(Text) - Len(Xstr))
        Xstr = Right(Xstr, Len(Xstr) - 1)
       
        Patterns(x).TimesRepeated = Left(Xstr, InStr(Xstr, " ") - 1)
        Xstr = Right(Xstr, Len(Xstr) - InStr(Xstr, " "))
        Patterns(x).Position = Left(Xstr, InStr(Xstr, " "))
        Xstr = Right(Xstr, Len(Xstr) - InStr(Xstr, " "))
        Patterns(x).Text = Xstr
        Xstr = ""
    Next
   
    'Instrt Patterns into text
    For x = 1 To UBound(Patterns)
        Xstr = ""
        For Y = 1 To Patterns(x).TimesRepeated
            Xstr = Xstr & Patterns(x).Text
        Next
        Text = Left(Text, Patterns(x).Position) & Xstr & Right(Text, Len(Text) - Patterns(x).Position)
    Next

    Decompress = Text
End Function


Comments

  1. 26 Aug 2006 at 17:56

        I don't really get it.. is this an ASP function? Because I tried running it as an ASP script and I got multiple errors...

  2. 09 Feb 2004 at 13:41

    i have a string, which does not have any blank spaces, nor does it have repetitive characters.....will this code be able to compress it further ???

  3. 24 Sep 2003 at 19:44

    You must not be useing it right.
    Are you caling the function in the format


    CompressedString = Compress(TextToCompress)

  4. 24 Sep 2003 at 19:41

    By compression quality I mean how much it compresses. The code will not alter the text from the origonal after decompression.

  5. 28 Jul 2003 at 21:38

    Do you mean how well it compresses, or how close the data will be to the original after compressing and decompressing?

  6. 06 Jun 2003 at 08:43

    Sorry Nick but as it seams your code does not work
    I have tried as input numbers, string and number with string but nothing happens

  7. 23 Jul 2002 at 12:01

    Nice code.. surprisingly enough, I can't follow it too well, but I'm sure it works great; however, I'd like to note a few things for your benefit:


    If you moved the "PatternLoop:" location to the line just before the "Loop" then you could remove the "If Text = "" Then Exit Do" line and it'll then check the value of Text at the "Do Until Text = """ line.


    Also, vbNullString is faster then testing for empty quotes.. and using Len() is even faster then that. In such big loops, I'd definitely recommend that you opt for maximum speed.


    It's also faster to use a variable to store function results that won't be changing between repeated calls to them.  For example, if you Dim a UB as long then instead of:
      ReDim Preserve Patterns(1 To UBound(Patterns) + 1)
      Patterns(UBound(Patterns)).Text = Char
      Patterns(UBound(Patterns)).TimesRepeated = PatternLen / CurPatternLen
      Patterns(UBound(Patterns)).Position = Len(Compress)
    You could do:
      UB = UBound(Patterns) + 1
      ReDim Preserve Patterns(1 To UB)
      Patterns(UB).Text = Char
      Patterns(UB).TimesRepeated = PaterrnLen / CurPatternLen
      Patterns(UB).Position = Len(Compress)
    This site has an article on how to test performance of things and what methods work best and that one was listed.. supposedly, if you use With, then it VB only needs to look for the location of the variable once, instead of with each reference to it.


    Well... keep up the good work and happy proggin!
      ~RaeVan

  8. 01 Jan 1999 at 00:00

    This thread is for discussions of String compression.

Leave a comment

Sign in or Join us (it's free).