Text-area & DIFF Text: Enter the text (left & right) into the text areas and press
the button "DIFF Text" to compare them - or
Choose File & DIFF Files: Choose the files (left & right) using the file browser
and press the button "DIFF Files/Directories" to compare them (Attention: In the
online/web version the file size is limited to 2*512 KB to avoid misuse of this service)
- or
Choose Directory & DIFF Directories: Choose the directories (left & right) using
the directory browser and press the button "DIFF Files/Directories" to compare them
(Attention: The comparison of directories is only supported in the standalone version;
in the web version you need to zip the directories before comparing them!)
diff algorithm (default smart diff): "smart diff", "character based diff" or
"line based diff":
"smart diff" (default): As the name says, this is a smart combination of the
algorithms "line based diff" and "character based diff". First the line based
diff is applied, and if the match is bad, i.e. below a certain threshold (20% is
the default), then the character based diff is applied.
"character based diff": A good choice for badly aligned text, i.e. texts that
have no common/similar line separation or that are formatted differently (tags),
e.g. when comparing HTML text with its plain text (copied from the browser), or
when comparing books that are formatted differently.
"line based diff": A good choice if the from and to input texts are line
oriented (as in most programming languages) and a good number of lines match.
The non-matching lines are compared in a second run using the character based
diff. False-positive line matches can have a bad impact on the matching result,
"tearing it apart". Most diff tools (unfortunately) only support line based
comparison. For an example, see the "meld" diff tool on Unix, where
false-positive matches can mess up the whole comparison.
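The fallback described for "smart diff" can be sketched as follows. This is only an illustration under stated assumptions: Python's difflib stands in for the tool's own diff algorithms, the "match" is assumed to be the ratio of matching lines, and the function names are hypothetical.

```python
import difflib


def line_match_ratio(left: str, right: str) -> float:
    """Similarity of the two texts measured on whole lines (0.0 to 1.0)."""
    a, b = left.splitlines(), right.splitlines()
    return difflib.SequenceMatcher(None, a, b, autojunk=False).ratio()


def smart_diff(left: str, right: str, threshold: float = 0.20):
    """Return (mode, opcodes).

    Assumption: if the line based match reaches the threshold, the line
    based result is used; otherwise the texts are compared character by
    character.
    """
    if line_match_ratio(left, right) >= threshold:
        mode, a, b = "line", left.splitlines(), right.splitlines()
    else:
        mode, a, b = "character", left, right
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return mode, matcher.get_opcodes()
```

Two texts that share most of their lines stay in "line" mode, while the same words wrapped at different positions have no line in common and fall through to "character" mode.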
max time (default 1000 milliseconds): The number of milliseconds the comparison is
allowed to take before it is stopped. Once this time limit is exceeded, the algorithm
shows as a percentage how many characters could be compared and returns the remaining
characters as non-matching.
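A time budget of this kind can be sketched as below. The comparison itself is deliberately naive (position by position, not a real diff) and only serves to show the deadline check and the percentage; the function name and the injectable clock parameter are assumptions made for illustration.

```python
import time


def compare_with_deadline(left: str, right: str, max_time_ms: int = 1000,
                          clock=time.monotonic):
    """Compare position by position until the time budget is used up.

    Returns (matching_positions, percent_compared). Positions that were
    not reached before the deadline count as non-matching.
    """
    deadline = clock() + max_time_ms / 1000.0
    total = max(len(left), len(right))
    matching = 0
    compared = 0
    for i in range(min(len(left), len(right))):
        if clock() > deadline:
            break  # budget exhausted: the rest stays uncompared
        if left[i] == right[i]:
            matching += 1
        compared += 1
    else:
        # Loop finished in time, so the surplus tail of the longer text
        # has effectively been examined as well (it has no counterpart).
        compared = total
    percent = 100.0 if total == 0 else 100.0 * compared / total
    return matching, percent
```

Passing a custom clock makes the timeout path easy to exercise without actually waiting.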
file hierarchy: filtering (default true): Indicates whether the line- and
overall-filters should be applied to every single file when comparing entire
file hierarchies (directories, zip-files, etc.). The default is "true", BUT be
aware that filtering every single file in big hierarchies can take quite long,
especially if overall filters are set, because for overall filtering the whole file
has to be read and filtered before it can be compared. Line filters have a much smaller
impact on performance because we read, filter and compare line by line and stop at the
first line that differs.
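The difference in cost between the two filter kinds can be illustrated with a streaming comparison that stops at the first differing line. This is a minimal sketch, not the tool's implementation; streams_differ and its line_filter parameter are hypothetical names.

```python
import io
from itertools import zip_longest


def streams_differ(left, right, line_filter=lambda line: line) -> bool:
    """Read both streams line by line and stop at the first difference.

    Only as many lines as necessary are read and filtered, which is why
    a line filter is cheap compared with a filter that needs the whole
    file content up front.
    """
    for l, r in zip_longest(left, right):
        if l is None or r is None:
            return True  # one stream has more lines than the other
        if line_filter(l) != line_filter(r):
            return True
    return False


# Usage: trailing and leading whitespace is ignored by the filter.
same = streams_differ(io.StringIO("a \nb\n"), io.StringIO("a\nb\n"), str.strip)
```

An overall filter, by contrast, would have to call read() on both streams and filter the complete text before the first comparison can happen.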
file: same size & same time: compare (default false): Indicates whether files
that have the same size and the same last modification date should be compared anyway.
If this flag is set to true, every single file in the hierarchy is compared! The
default is "false" BECAUSE if both "file hierarchy: filtering" AND "same size & same
time" are set to true, this has a BIG PERFORMANCE IMPACT ON LARGE FILE HIERARCHIES!
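The shortcut behind this option can be sketched as a small decision function. The names below are hypothetical, and the exact metadata the tool inspects is an assumption; the sketch uses the file size and the modification time as reported by os.stat.

```python
import os


def should_compare(left_size: int, left_mtime: float,
                   right_size: int, right_mtime: float,
                   compare_same_size_and_time: bool = False) -> bool:
    """Decide whether two files need a content comparison.

    With the flag off (the default), files with identical size and
    modification time are assumed to be equal and are skipped.
    """
    same_metadata = left_size == right_size and left_mtime == right_mtime
    return compare_same_size_and_time or not same_metadata


def should_compare_paths(left_path: str, right_path: str,
                         compare_same_size_and_time: bool = False) -> bool:
    """Same decision, reading size and mtime from the file system."""
    ls, rs = os.stat(left_path), os.stat(right_path)
    return should_compare(ls.st_size, ls.st_mtime, rs.st_size, rs.st_mtime,
                          compare_same_size_and_time)
```

With the flag off, a large hierarchy with few changes only pays for a stat call per file pair; with the flag on, every pair is read and compared.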
diff output: merge matching sections (default true): Normally we are only
interested in the non-matching sections and want them displayed next to each other.
Therefore merge matching sections can/should be set to true.
diff output: unified diff: take unchanged lines of original (default true):
Unified diff was not developed with filtering (and ignoring empty lines) in mind.
Because of filtering, the unchanged lines (aka context lines) in the original and
the new file may therefore not fully correspond. To work around this limitation,
this option lets the user choose whether to take the unchanged lines from the
original or from the new file. Normally the unchanged lines should be taken from the
original file because the patch is applied to the original file. (If you want the
unified diff output to be fully accurate, you should disable the overall-filtering and
the ignoring-of-empty-lines.)
diff output: unified diff: number of unchanged lines (default 1):
As mentioned above, the unified diff can be fuzzy due to filtering, so the
number of unchanged lines in the original and the new file might not correspond.
To minimize this impact it is suggested to set the number of unchanged/context
lines to "1", because one single line always corresponds. (Normally the number of
unchanged lines in a unified diff is set to "3".)
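The effect of the context size can be seen with Python's difflib.unified_diff, used here only as a readily available unified diff generator and not as the tool's own implementation. The helper name and the sample lines are made up for the example.

```python
import difflib


def unified(original_lines, new_lines, context=1):
    """Return the unified diff as a list of lines with the given context size."""
    return list(difflib.unified_diff(
        original_lines, new_lines,
        fromfile="original", tofile="new",
        n=context, lineterm=""))


original = ["a", "b", "c", "d", "e", "f", "g"]
changed = ["a", "b", "c", "X", "e", "f", "g"]

# One context line on each side of the change.
for line in unified(original, changed, context=1):
    print(line)
# --- original
# +++ new
# @@ -3,3 +3,3 @@
#  c
# -d
# +X
#  e
```

With context=3 the same change produces a hunk that spans all seven lines, which leaves more room for filtered context lines to disagree with the file being patched.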
style: font: size: diff input and output (default 10):
The font size that should be used for the diff in- and output.
line filter: pattern: Enter a regular expression pattern that should be applied to
each line. For example, to match the leading and trailing spaces on a line use the
following pattern: "(^\s+)|(\s+$)"
line filter: capturing groups: Enter a comma separated list of the numbers (0,1,...
or 1,3,... or 2 or ...) of the capturing groups you are interested in filtering. For
example the pattern "(^\s+)|(\s+$)" provides three groups: 0 (always matches the whole
expression), 1 (matches "(^\s+)", i.e. the leading spaces) and 2 (matches "(\s+$)",
i.e. the trailing spaces)
line filter: capturing groups: action: Select the action that should be applied
to the content matched by the selected capturing groups: "remove" (the content
matched by the group is removed / ignored for the diff) or "keep" (only the content
matched by the group is considered for the comparison)
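How pattern, capturing groups and action might interact is sketched below with Python's re module. The semantics are an interpretation of the description above, not the tool's actual code, and apply_filter is a hypothetical helper.

```python
import re


def apply_filter(text: str, pattern: str, groups=(0,), action: str = "remove",
                 flags: int = 0) -> str:
    """Apply a filter to text.

    remove: cut the content of the selected groups out of the text.
    keep:   return only the content of the selected groups.
    """
    regex = re.compile(pattern, flags)
    if action == "keep":
        return "".join(m.group(g) for m in regex.finditer(text)
                       for g in groups if m.group(g) is not None)
    if action != "remove":
        raise ValueError("action must be 'remove' or 'keep'")

    def strip_groups(m):
        whole, base = m.group(0), m.start()
        # Spans of the selected groups, relative to the start of the match.
        spans = sorted((m.start(g) - base, m.end(g) - base)
                       for g in groups if m.group(g) is not None)
        kept, pos = [], 0
        for start, end in spans:
            if start > pos:
                kept.append(whole[pos:start])
            pos = max(pos, end)
        kept.append(whole[pos:])
        return "".join(kept)

    return regex.sub(strip_groups, text)
```

For the line "  foo bar  " and the pattern from the text, removing groups 1 and 2 yields "foo bar", while removing only group 1 keeps the trailing spaces. With the made-up pattern "id=(\d+)" and action "keep" on group 1, only the digits of "id=42 name=x" remain.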
overall filter: pattern: Analogous to "line filter", but applied (after the
line filters have been applied) to the whole text or file. For example a filter
that matches the JavaDoc comments would be: "/\*\*.*?\*/"
overall filter: capturing groups: Analogous to "line filter: capturing groups"
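The JavaDoc pattern can be tried out on a whole text as shown below. One detail worth noting: for "." to match across line breaks, the regex engine needs its "dot matches newline" mode (re.DOTALL in Python, Pattern.DOTALL in Java). Whether the tool enables this mode for overall filters is an assumption of this sketch.

```python
import re

# Non-greedy match from "/**" to the next "*/", spanning line breaks.
JAVADOC = re.compile(r"/\*\*.*?\*/", re.DOTALL)


def strip_javadoc(source: str) -> str:
    """Remove all JavaDoc comments from the given source text."""
    return JAVADOC.sub("", source)


source = "/** Adds two numbers.\n * @return the sum\n */\nint add(int a, int b);"
print(strip_javadoc(source))
```

Without the DOTALL flag the pattern only matches JavaDoc comments that start and end on the same line.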
diff option: ignore empty lines (default true): In most cases empty lines can be
ignored when comparing text (because they have no relevance besides formatting)
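Ignoring empty lines amounts to dropping them before the comparison, as in the sketch below. Treating whitespace-only lines as empty is an assumption; the function name is hypothetical.

```python
def significant_lines(text: str, ignore_empty: bool = True):
    """Split text into lines, optionally dropping empty (or blank) lines."""
    lines = text.splitlines()
    if not ignore_empty:
        return lines
    return [line for line in lines if line.strip() != ""]
```

With the option on, "a\n\nb\n" and "a\nb\n" produce the same list of lines and therefore compare as equal.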
file: character set (default UTF-8): The character set that is used when reading
a file from disk. The default character set on Linux is UTF-8. The default character
set on Windows is Windows-1252 (CP-1252), a character encoding of the Latin alphabet
that is almost identical to ISO-8859-1 except in the control character range 80 to 9F
(hex).
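The effect of the chosen character set, including the 80 to 9F range mentioned above, can be reproduced with Python's built-in codecs. The byte values are picked for illustration only.

```python
def decode(raw: bytes, charset: str) -> str:
    """Decode raw file bytes with the given character set."""
    return raw.decode(charset)


raw = bytes([0x80, 0xE9])

# Windows-1252 maps 0x80 to the euro sign; ISO-8859-1 maps it to a
# control character. 0xE9 is "é" in both.
as_cp1252 = decode(raw, "cp1252")       # "€é"
as_latin1 = decode(raw, "iso-8859-1")   # "\x80é"

# Reading UTF-8 bytes with the wrong character set garbles the text.
garbled = decode("é".encode("utf-8"), "cp1252")  # "Ã©"
```

Two files with identical text but different encodings therefore only compare as equal when each side is read with its own character set.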
follow symbolic links (default true): Defines whether symbolic links to files and
directories should be followed during comparison. The default is true, i.e.
symbolic links are followed. (Attention: The symbolic link option is only supported
in the standalone version. The web version works on single files or zip-files, where
symbolic links are not relevant; the option only matters when comparing entire
directories in the standalone version.)
History:
NEW (08.2015): Microsoft Word (.doc & .docx) & Excel (.xls & .xlsx) support:
Now also supports the comparison of Microsoft Word (.doc & .docx) & Excel (.xls &
.xlsx) files (file load based on Apache POI)!
NEW (09.2015): OpenOffice & LibreOffice Writer (.odt) support:
Now also supports the comparison of OpenOffice- and LibreOffice-Writer (.odt) files!
NEW (10.2015): PDF (.pdf) support:
Now also supports the comparison of PDF-files (file load based on Apache PDFBox)!
NEW (11.2015): Linear Space Refinement: Improved quality & performance:
Implemented a variation of Eugene W. Myers' linear space refinement algorithm and
further improved the diff quality and performance!
NEW (12.2015): AngularJS: Improved GUI:
Implemented the GUI in AngularJS for a better look & feel and user experience.
NEW (01.2016): File hierarchy: Directory & Zip-File Support:
Supports the comparison of entire file hierarchies: directories (only stand-alone
client) and zip-files and outputs the differences in a tree-view.
NEW (02.2016): Unified Diff Support:
Supports unified diff. The unified diff output is often used as input to patch programs.
Many projects specifically request that "diffs" be submitted in the unified format,
making unified diff format the most common format for exchange between software
developers. (See also: diff utility.)
NEW (05.2016): Max-Time Support:
(Option: "max time": Default: 1000 milliseconds): The user can now set a maximum time
the comparison may take. Once this time limit is exceeded, the algorithm shows as a
percentage how many characters could be compared and returns the remaining characters
as non-matching.
NEW (06.2016): Performance Improvement:
The reconciliation algorithm has been improved further, and as a result both the
performance and the diff results became even better.
NEW (07.2016): Lazy Load & Scrollbar Support:
Only the nodes/files that are different are loaded, shown and expanded in the
explorer. Further nodes are loaded on demand when clicking the corresponding folder icon
in the explorer. When navigating from the diff-explorer- to the diff-file-view and back
again, the scrollbar position in the explorer is maintained (making sure the user does
not lose orientation).
NEW (08.2016): Filter support on hierarchies:
(Option: "file hierarchy: filtering": Default: "yes"): When comparing entire file
hierarchies (directories, zip-files, ...) the line- and overall-filters can now be
applied to every single file, which makes the file hierarchy diff result much more
accurate, i.e. no false positives, because the same filtering is now applied to each
file as when comparing just single files.
NEW (08.2016): Character-Set/Encoding support:
(Option: "file: character set": Default: "UTF-8"): We can now explicitly set the
character set/encoding of the left- and right-file before comparing them. This helps to
avoid differences caused by files that have been persisted in different encodings.
NEW (09.2016): Font Size: Diff In-/Output:
The user can now set the font size for the diff in- and output sections.
NEW (09.2016): On-the-fly Edit & Save support:
The user can edit the diff-content, and while typing the diff result gets updated
on-the-fly. And last but not least the user can then save the modified files and diff
results.
NEW (01.2018): Copy & same-size/same-time support on hierarchies:
NEW Copy functionality to copy single files or entire sub-directories from left to right
and vice versa!
(Option: "file: same size & same time: compare": Default: "no"): When comparing entire
file hierarchies (directories, zip-files, ...) files with the same size and the same
time can now be compared as well, so that every single file is compared. This makes the
file hierarchy diff result more accurate, but potentially at a high performance cost
depending on the directory size; the option is therefore set to false by default.
NEW (03.2018): Follow Symbolic Links:
The user can select whether symbolic links to files and directories should be followed
and considered during comparison or not!
Tree Navigation: To navigate, use Ctrl+Shift+Cursor-Keys (←↑→↓), and use
Ctrl+Shift+Enter (↲) to compare the selected node (and hide or show the tree).
Operations: To copy, use Ctrl+Shift+C.