This directory contains the GNU DIFF and DIFF3 utilities, version 1.14.
See file COPYING for copying conditions.  To compile and install on
system V, you must edit the makefile according to comments therein.

Report bugs to bug-gnu-utils@prep.ai.mit.edu
 
This version of diff provides all the features of BSD's diff.
It has these additional features:

   -a	Always treat files as text and compare them line-by-line,
	even if they do not appear to be ASCII.

   -B	ignore changes that just insert or delete blank lines.

   -C #
	request -c format and specify number of context lines.

   -F regexp
	in context format, for each unit of differences, show some of
	the last preceding line that matches the specified regexp.

   -H	use heuristics to speed handling of large files that
	have numerous scattered small changes.  The algorithm becomes
        asymptotically linear for such files!
	
   -I regexp
	ignore changes that just insert or delete lines that
	match the specified regexp.

   -N	in directory comparison, if a file is found in only one directory,
	treat it as present but empty in the other directory.

   -p	equivalent to -c -F'^[_a-zA-Z]'.  This is useful for C code
	because it shows which function each change is in.

   -T	print a tab rather than a space before the text of a line
	in normal or context format.  This causes the alignment
	of tabs in the line to look normal.


GNU DIFF was written by Mike Haertel, David Hayes, Richard Stallman
and Len Tower.  The basic algorithm is described in: "An O(ND)
Difference Algorithm and its Variations", Eugene Myers, Algorithmica
Vol. 1 No. 2, 1986, p 251.


Suggested projects for improving GNU DIFF:

* Handle very large files by not keeping the entire text in core.

One way to do this is to scan the files sequentally to compute hash
codes of the lines and put the lines in equivalence classes based only
on hash code.  Then compare the files normally.  This will produce
some false matches.

Then scan the two files sequentially again, checking each match to see
whether it is real.  When a match is not real, mark both the
"matching" lines as changed.  Then build an edit script as usual.

The output routines would have to be changed to scan the files
sequentially looking for the text to print.