As developers, we are used to see the difference between two (text) files via diff
, git diff
, or related tools. How to see the difference between two Excel files?
There are many options, and your choice would probably depend on what you mean by “difference”.
1) CSV + Meld
This is the easiest approach. Save both files as .csv
, and use diff
to compare them. Alternatively, you can use Meld
, which is basically a UI for diff
.
For big Excel files, or for two files with lots of differences, it’s usage is limited.
2) ExcelCompare
ExcelCompare is a sophisticated Java based tool with lots of options. I guess it’s most useful if you already have an idea what kind of difference to expect.
3) CSV + MySQL
This approach requires the most effort but comes with high flexibility.
First, save both Excel files in .csv
format.
Then, create a scratch database in MySQL:
> create database excel_diff_tmp;
Next, create two tables which match your Excel file’s tables:
CREATE TABLE `sheet1` (
id int(11) NOT NULL AUTO_INCREMENT,
column_a varchar(255) DEFAULT NULL,
column_b varchar(255) DEFAULT NULL,
# ...
PRIMARY KEY (`id`)
) DEFAULT CHARSET=utf8;
Now, load both .csv
files into the database using LOAD DATA LOCAL INFILE
:
LOAD DATA LOCAL INFILE '/path/to/file1.csv'
INTO TABLE sheet1
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;
You now have the full power of SQL at you hands to compare these two tables. You can do things like
- Show rows which are in file 1, but not in file 2 (using all columns)
- Show rows which are in file 1, but not in file 2 (using only some columns)
- Show rows equal in both files
- etc.
From MySQL Workbench, you could export the result back as a .csv
file, which you can finally convert to Excel format.