Tangle 3.2 MS-DOS file encryption program (Turbo C source)

Alan Grant Finlay alanf at bruce.cs.monash.OZ.AU
Fri Jun 7 20:15:01 AEST 1991


This is the Turbo-C source distribution of tangle 3.2, a file
encryption program for MS-DOS to run on IBM-PC compatibles.

Files are separated and preceded with ---------<file name>---------------

The file read.me refers to some files which are only required for an
executable distribution, they can easily be manufactured.

--------------------<read.me>-----------------------
This file accompanies a set of  programs  to  support  a  personal  encryption
system  for  use  with  IBM-PC  compatibles  running  MS-DOS.  The encryption/
decryption process is symmetric and takes only a few seconds for  one  or  two
pages of text.  Very large files take a significant length of time to convert.
The source files for "tangle.com" and "untangle.com" are "tangle.c"  with  the
ENCRYPT  constant  set  to  1 or 0 respectively.  Since not all users may have
access to the Turbo-C compiler, executable versions are  also  supplied.   The
file  "tangle.doc"  is a user manual.  A description of the algorithm is given
in "algthm.txt", and the author's address in "author.txt".  The file "dir.cpt"
is a directory of the distribution disk (without itself listed) encrypted with
the password "abracadabra".   You can untangle this file and check file sizes.
To use the programs type "tangle src dst"  or  "untangle src dst" where src is 
source file and dst is the destination file.
----------------------<tangle.doc>---------------------
                         User manual for Tangle 3.2
                         --------------------------

This program supercedes Crypt 2.0, a name change has  been  adopted  to  avoid
confusion with the Unix password encryption function (one way trapdoor).  This
program does not support the encryption algorithm of  Crypt  2.0  however  the
name  change should make it somewhat easier to introduce version 3 without too
much confusion and gradually convert the encrypted files.

The source code "tangle.c" can be  used  to  produce  two   executable   files
which  are   assumed   to  be  called  "tangle"  and "untangle" by the program
itself.  The source is in Turbo C (Trademark) and can be  compiled  with   the
tiny   memory model.   Since  use  is  made  of system dependent functions for
obtaining the input file size for tangle, the program is not portable as is.

To use tangle and untangle simply type the command name followed by the source
and  destination  file  names.  You will be prompted for a password and status
information is produced.  The password is  echoed  since  for  the  length  of
password  intended it would be too easy to mess it up.  The password should be
a long sequence of characters such as a sentence with some  strange  words  or
characters  in  it.  The maximum length is the same as the maximum block array
width (100 as supplied).  The password is scrolled off the  screen  after  the
line  is  terminated  and  an  error  checking  code  is displayed.  The error
checking code number reveals some  information  about  the  password  but  not
enough to worry about.  The purpose of the error checking code is to alert the
user to an incorrectly typed password, since it is a function of the  password
it should always be the same when the same password is used.

The algorithm used was inspired by the Rubik's cube.  The core algorithm works
as follows:

   First a fixed block size is determined which is the product of two numbers
   "wide" and "high".  The data is processed in a two dimensional array with
   these dimensions one block at a time.  If the last block cannot be entirely
   filled it is padded with junk characters.  The password is used to
   determine a key on a column per character basis.  The key value for each
   column (modulo the height) is the amount this column is shifted down.  The
   part which emerges from the bottom is inserted into the top (i.e. a cyclic
   shift).  The rows are similarly shifted but using the values 1,2,3,...,etc.
   This two stage process which is called a shuffle is repeated a fixed number
   of times.  The result is a permutation of the array determined by the
   password alone.

This algorithm produces a pure permutation of the data which has the advantage
of  not  changing  the character set but weaknesses that cannot be ignored for
any serious use.  A number of modifications to the algorithm produce one which
in  the  author's opinion is secure against a chosen plaintext attack and if a
large enough good password is used then immune to brute force attack  for  the
conceivable future.

The first modification is to change the nature of the algorithm so that it  is
no  longer  a pure permutation.  The strategy adopted is to include a phase in
each shuffle which replaces each second line by the  bitwise  exclusive-or  of
itself  and  the  line  before.  This means that the history of the encryption
process, which characters happen to  be  located  above  each  other  in  each
shuffle, affects the final outcome.  An analogy is that of shuffling a pack of
cards on which the ink is still wet as compared to  a  normal  shuffle.   With
this  modification  the  algorithm  is  strong against known plaintext attack.
Secure that is provided the known text is not a simple pattern (e.g. all space
characters).   To secure against simple patterned source a simple substitution
of the block is performed after the first  shuffle.   This  eliminates  simple
patterns  and  also  prevents  a  simple  chosen plaintext attack strategy the
author discovered which applies to Crypt 2.0 (Tangle's predecessor).

If the input file is smaller than 450 bytes it is padded to  this  size.   The
block  size  is  determined  according  to  the size of the input file and the
minimum size.  For files smaller than 10 Kbyte there will be only  one  block.
Each  block  has two dimensions - wide and high.  The last block usually needs
some padding which is chosen from whatever happens to be in the array already.
The  junk  is  selected  randomly  though the random number generator has only
about 30,000 possible seeds. The password or key is a string supplied  by  the
user,  this  is extended to the maximum length (100) by repetition however the
repeated strings are modified  by  exclusive-or  '152'  to  help  extract  the
maximum  information  from the supplied string (since later only the remainder
mod high will be utilised).

The security of the system is highly dependent on the choice of  password  and
particularly  its  length.   Passwords  which  are  too  short  are  rejected.
Naturally  the  password  is  not   recorded  permanently   anywhere   by  the
encryption  program  and  version  3.0  erases the password from memory before
exit.  The easiest way to choose a good password is to use  a  whole  sentence
(spaces and punctuation are  allowed) with  some  strange  characters  in  it.
It is relatively easy for a cryptanalyst to try out all short  sentences  with
words  from  a  standard  dictionary.  Brute  force  cracking of an  encrypted
file with  n blocks of block size b, a password of p units chosen from  a  set
of  q  values  and  with  s  shuffles  per  block  requires  O(s  * b * (q^p))
operations.  Obviously increasing q and/or p pays great dividends.  For a pure
permutation  it  would be possible to try all b!  rearrangements of  a  block.
This  is ruled out by the minimum block size being 450.  To check all possible
keys  with  the  minimum block dimensions of 32*15 (width*height) and with the
minimum password length of 10 characters we need to  consider  15^10  possible
keys.   15^10  possibilities would take a week to check at the rate of one per
microsecond.  Actually the number of keys is greater than this (note  how  the
password  is extended above) so it would take longer.  Using a password length
of 15 characters the time would be  increased  to  over  ten  thousand  years.
Longer  passwords are supported by the program since these calculations assume
an unpredictable password is used.  Even if the  password  is  chosen  from  a
dictionary of 1000 four letter words the program would allow 6 such words with
the minimum block size corresponding to 1000^6  keys  which  is  approximately
15^15.   It  is  clear that increasing s is not an effective way of  improving
security against brute force attack.  The time taken by the algorithm is O(s *
b  *  n).   Since  (b*n)  cannot  be changed s should be chosen as small as is
considered safe.  Note that s is  a  compile  time  constant  for  the  Tangle
program and may be changed (it is called SECURE).

There is another security issue to consider.  Even when a cyphertext cannot be
deciphered  it  may  reveal  information  about  a  sequence  of  cyphertexts.
Consider the encryption of two files differing by  only  one  character.   The
difference  in  the  input files causes a cascade of differences in the output
files due to the exclusive-or in  the  shuffle  operation.   This  cascade  is
approximately  1.5 to the power of the number of shuffles when the differences
are sparse hence ideally the value chosen for SECURE should be  at  least  the
logarithm base 1.5 of the block size.  The block size varies from 450 to 10000.
Hence a conservative choice for SECURE would be 2*LOG(100)/LOG(1.5)=23.  Using
the minimum block size of 450, SECURE=15 would be sufficient.  This should not
be taken too seriously however since if the files require more than one  block
then  the identical input blocks (except most likely the last block due to the
padding) will be encrypted identically.  The value chosen for SECURE in Tangle
is  10  which  is  a  compromise.

The security of the above algorithm to detection of similar  sources  is  also
compromised  by  the  fact  that  so far the algorithm encrypts each bit plane
(corresponding to the 8 bits in a byte) independently.  This is a weakness  of
version  2  which  has  been  corrected  in  version  3.0.   A  one  character
difference will be a one bit difference in some planes but not very likely for
all  planes.  To fix this the exclusive-or phase of the shuffle is modified in
version 3 to add 1 mod 256 to the byte.  This means  the  bit  planes  are  no
longer independent and each entire block is mixed together.  It would have been
preferable to use a cyclic shift instead of adding 1 mod 256 but the C language
does not have this operation.  Because a carry into the 8th bit position will
only occur on average one time in 128, SECURE must be chosen large enough to
cause a cascade of changes in the 8th plane.

If similar source produces similar output then it seems likely a chosen 
plaintext attack could be mounted.  Hence for systems requiring this security
SECURE should be chosen as large as the conservative estimates above (15 to 23,
according to the block size).  Since the intended domain of application to
Tangle is personal computer file security, the more modest default value of
SECURE=10 should suffice.  Where protection against casual snooping is all that
is required then SECURE=3 would be sufficient.  In this case it would be a good
idea to remove the redundant phases of the last shuffle.  Of the three phases
of a shuffle, only the first depends upon the key.  This means the final two
phases of the last shuffle are redundant and may be omitted.


Version 3 also incorporates a change recommended by Russel Herman that using a
floating  point  library  for  a  single  use  of the square root function was
unwarranted.  Version 3 uses a simple linear search  square  root  calculation
and this cuts the size of the executable code by about 50% - thanks Herman!

Some final operational advice.  Confidential documents should be created using
a  ram  disk or a floppy which will be destroyed.  Note that deleting and even
overwriting magnetic media does not prevent the data's  recovery.   Watch  out
for  editors  which keep temporary copies of files which they are working upon
in places of their own choice.  A  deleted  temporary  file  is  often  easily
recovered and you may not even be aware of its existence.

(The algorithm is the same for all versions 3.x)

Version 3.2: The only changes are the removal of a superfluous include and 
             alterations to comments in the source.


--------------------------------<tangle.c>---------------
/***********************************************************************/
/* (c) Copyright 1989, 1990, Alan Finlay.      Tangle, Version 3.2 .    */
/* This program may be freely distributed but may not be sold or         */
/* marketed in any form without the permission and written consent of     */
/* the author.  I retain all copyrights to this program, in either the    */
/* original or modified forms, and no violation, deletion, or change of   */
/* the copyright notice is allowed.  I will accept no responsibility for  */
/* any loss, damage, or compromised data caused directly or indirectly by */
/* this program.  It is released on an "as is" basis with no warranty    */
/* as to its being fit or suitable for encrypting any form of data.     */
/***********************************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <dos.h>
#include <sys\stat.h>

#define ENCRYPT 1        /* Choose tangle (1) or untangle (0) */
#define DEBUG 0          /* Show page after each shuffle if non zero */
#define TRUE -1
#define FALSE 0
#define LIMIT 100        /* Maximum block size is LIMIT*LIMIT */
#define SECURE 10        /* The number of block transformations */
#define MINB 450         /* Minimum block size - insecure if too small */

typedef unsigned char line[LIMIT];

char copyright[40] = "(c) copyright 1989,1990, Alan Finlay";

unpat(page,wide,high) /* Simple substitution to eliminate simple patterns */
   line page[LIMIT];  /* [width,height] */
   int wide,high;
   {
   int i,j,k;
   k = 0;
   for (i=0;i<wide;i++) for (j=0;j<high;j++) {
      k = (k+7)%256;
      page[i][j] = page[i][j] ^ k;
      }
   }

#if ENCRYPT
shuffle(page1,code,wide,high)
   line page1[LIMIT];  /* [width,height] */
   line code;
   int wide,high;
   {
   int i,j,k,key,shift;
   line *mix1,*mix2;
   line *oldline,page2[LIMIT];  /* [height,width] */
   for (k=0;k<SECURE;k++) {
#if DEBUG
      show(page1,wide,high);
#endif
      /* Shift columns */
      for (i=0;i<wide;i++) {
         oldline = page1[i];
         key = (int) code[i];
         for (j=0;j<high;j++) page2[j][i] =(*oldline)[(j+key)%high];
         }
      /* Mixup */
         for (j=1;j<high;j+=2) {
         mix1 = page2[j-1];
         mix2 = page2[j];
         for (i=0;i<wide;i++) (*mix2)[i] = ((*mix2)[i]^(*mix1)[i])+1;
               /* Assume overflow ignored so 255+1==0 */
         }
      /* Shift rows */
      for (j=0;j<high;j++) {
         oldline = page2[j];
         shift = (j%(wide-1))+1;
         for (i=0;i<wide;i++) page1[i][j] = (*oldline)[(i+shift)%wide];
         }
      /* Eliminate any pattern (after first iteration only) */
      if (k==0) unpat(page1,wide,high);
      }
   }

#else
unshuffle(page1,code,wide,high)
   line page1[LIMIT];  /* [width,height] */
   line code;
   int wide,high;
   {
   int i,j,k,key,shift;
   line *mix1,*mix2;
   line *newline,page2[LIMIT];  /* [height,width] */
   for (k=0;k<SECURE;k++) {
#if DEBUG
      show(page1,wide,high);
#endif
      /* Eliminate any pattern (before last iteration only) */
      if (k==SECURE-1) unpat(page1,wide,high);
      /* Shift rows back */
      for (j=0;j<high;j++) {
         newline = page2[j];
         shift = wide-(j%(wide-1))-1;
         for (i=0;i<wide;i++) (*newline)[i] = page1[(i+shift)%wide][j];
         }
      /* Reverse mixup */
      for (j=1;j<high;j+=2) {
         mix1 = page2[j-1];
         mix2 = page2[j];
         for (i=0;i<wide;i++) (*mix2)[i] = ((*mix2)[i]-1)^(*mix1)[i];
                /*  Assume underflow is ignored so 0-1==255 */
         }
      /* Shift columns back */
      for (i=0;i<wide;i++) {
         newline = page1[i];
         key = (int) code[i];
         for (j=0;j<high;j++) (*newline)[(j+key)%high] = page2[j][i];
         }
      }
   }
#endif

show(page,wide,high)
   line page[LIMIT];
   int wide,high;
   {
   int i,j;
   puts("\n");
   for (j=0;j<high;j++) {
      putc('\n',stdout);
      for (i=0;i<wide;i++) {
         if (page[i][j]<30) putc('*',stdout);
         else putc(page[i][j],stdout);
         }
      }
   }

main (argc,argv)
   int argc;
   char *argv[];
{
   FILE *infile,*outfile;
   int wide,high,i,j,k;    /* Block width and height, loop counters */
   int blkn = 1;           /* Block counter */
   int clen;        /* Password code length */
   long chksum;     /* Password checksum */
   int ch = 0;
   int invers;             /* Version of input file for decrypt */
   int vers = 3;           /* Version of this program */
   line page[LIMIT],code;
#if ENCRYPT
   int chrcnt;       /* Character counter */
   long fsize;       /* Input file size */
   int blocksize,nblocks;
   struct time t;  /* For system time */
   struct stat st; /* For input file stats */
   /* Randomise the rand() function */
   gettime(&t);
   srand(t.ti_min*400+t.ti_sec*100+t.ti_hund); /* random seed <30000 */
   /* Check the input arguments */
   if (argc!=3) {puts("\nUsage is: tangle src dst\n"); exit(1);}
#else
   int blkcnt;
   /* Check the input arguments */
   if (argc!=3) {puts("\nUsage is: untangle src dst\n"); exit(1);}
#endif
   if ((infile = fopen(argv[1],"rb")) == NULL) {
      printf("\n%s",argv[1]); perror(" "); exit(1);}
   if ((outfile = fopen(argv[2],"wb")) == NULL) {
      printf("\n%s",argv[2]); perror(" "); exit(1);}
#if ENCRYPT
   /* Get input file size */
   if (stat(argv[1],&st)!=0) {perror(" "); exit(1);}
   fsize = st.st_size;
   printf("The input file size is %ld\n",fsize);
   /* Choose block size accordingly */
   if (fsize<(LIMIT*LIMIT)) blocksize = (int) fsize;
   else {
      nblocks = (int) (fsize/(LIMIT*LIMIT)+1);
      blocksize = (int) (fsize/nblocks+1);
      }
   if (fsize<MINB) blocksize = MINB;    /* Minimum block size enforced */
   wide = 0; while (wide*wide<blocksize) wide++;  /* Approx square root */
   wide = wide+10; if (wide>LIMIT) wide = LIMIT;
   high = blocksize/wide+1; if (high>LIMIT) high = LIMIT;
   while (1) {
      blocksize = wide*high;
      if (fsize<(long) blocksize) break;
      else {
         /* Multiple blocks, check for last block too small */
         if (((fsize-1)%blocksize)>(blocksize*3/4)) break;
         /* (fsize-1) is used above so perfect fit is accepted! */
         high--; wide--; /* Try a smaller block */
         }
      if (wide<50) break;
      }
   printf("The width and height are (%d,%d)\n",wide,high);
   printf("The last block is %ld bytes\n",((fsize-1)%blocksize)+1);
   fprintf(outfile,"%d,%d,%d,",vers,wide,high);
#else
   fscanf(infile,"%d,%d,%d,",&invers,&wide,&high);
   if (invers!=vers) {
      printf("This is version %d of the encryption program.\n",vers);
      printf("The input file is for program version %d or invalid.\n",invers);
      exit(1);
      }
#endif
   /* Get password */
   while(1) {
      puts("\nPlease enter your password");
      fgets(code,LIMIT,stdin);
      clen = strlen(code);
      if (clen>9) break;
      puts("Insecure password, try a longer one.");
      puts("For security do not use a name or word in any dictionary.");
      puts("For example use something like \"Dazed and Konfuzed\"");
      }
   for (i=0;i<25;i++) puts(" ");   /* Clear the screen */
   if (clen>wide) puts("Warning: tail of password ignored");
   /* Extend password to possible limit, not null terminated */
   for (i=clen;i<LIMIT;i++) code[i] = code[i%clen] ^ '\152';
   /* Generate a checksum for the characters */
   for (chksum=0,i=0;i<clen;i++) chksum += (int) code[i]*i;
   printf("The password checksum is %ld.  Please wait ...\n",chksum % 1000);
   do { /* tangle or untangle a block */
#if ENCRYPT
      chrcnt = 0;
#else
      if (fscanf(infile,"%d,",&blkcnt)==EOF) goto NOBLOCK;
#endif
      for (j=0;j<high;j++) {
         for (i=0;i<wide;i++) {
            if ((ch = getc(infile)) != EOF) {
               page[i][j] = ch;
#if ENCRYPT
               chrcnt++;}
            else if (i==0 && j==0) goto NOBLOCK; /* EOF at start of block! */
            /* Pad the last block with existing junk */
            else page[i][j] = page[rand()%wide][rand()%high];
#else
               ;}
            else {puts("Error: unexpected end of file"); goto NOBLOCK;}
#endif
            }
         }
#if ENCRYPT
      fprintf(outfile,"%d,",chrcnt);
      shuffle(page,code,wide,high);
      for (j=0;j<high;j++) for (i=0;i<wide;i++) putc(page[i][j],outfile);
#else
      unshuffle(page,code,wide,high);
      for (j=0;j<high;j++) for (i=0;i<wide;i++)
         if ((j*wide+i)<blkcnt) putc(page[i][j],outfile);
#endif
      printf("Finished block number %d\n",blkn++);
      }
   while (ch != EOF);
NOBLOCK:                  /* Jump here to avoid writing an empty block */
   for (i=0;i<LIMIT;i++) code[i] = ' ';   /* Rubout the password before exit */
   fclose(infile);
   fclose(outfile);
}
------------------------------<author.txt>-------------------
Alan Finlay
Computer Science Dept.
Monash University
CLAYTON VIC 3168
Australia

-------------------------
email: alanf at bruce.oz.au
N.B. I wrote this program on my own machine in my own time, hence it does
not belong to Monash University.
--------------------------------<algthm.txt>-----------------
For each block, let us call it page[x,y] where 0<=x<wide and 0<=y<high:
(it is understood that each line is for all x<wide and for all y<high)
-----------------------------------------------------------------------
   1) Do one shuffle:
         page[x,y] := page[x,(y+key[x]) mod high]
         if y is odd then page[x,y] := (page[x,y] XOR page[x,y-1])+1 mod 256
         page[x,y] := page[(x+(y mod (wide-1))+1) mod wide,y]

   2) Perform a simple substitution:
         page[x,y] := page[x,y] XOR ((x*high+y)*7 mod 256)

   3) Do 9 shuffles:
         page[x,y] := page[x,(y+key[x]) mod high]
         if y is odd then page[x,y] := (page[x,y] XOR page[x,y-1])+1 mod 256
         page[x,y] := page[(x+(y mod (wide-1))+1) mod wide,y]

Each step in the process is easily reversible.
-------------------------------------------------------------------------

The encrypted file format is:
        1) Three decimal integers: program version number (= 3), wide, high.
           Each as chars terminated by a comma (i.e. C format "%d,%d,%d,").
        2) for each block:
              The real number of characters in the block as a comma
              terminated decimal integer (i.e. C format "%d,").
              The block of bytes:
                 for (j:=0..high-1), for (i:=0..wide-1) page[i,j]

-------------------------------------------------------------------------
---------------- end of Tangle 3.2 source distribution ------------------



More information about the Alt.sources mailing list