Ace:Comparing Collections
The ACE Audit Manager (AM) allows you to compare collections to an external manifest. This is useful when you want to compare sites that hold replicated data, or to verify that all data was successfully loaded during ingestion.
There are two ways to compare collections. The first is to load an external data file containing digests and filenames. The second is to register peer ACE AM installations and automatically compare collections as part of an audit.
Manually Compare
Manually comparing collections is a two-step process. First, a source list of digests and filenames is generated. Second, that list is loaded into an AM installation for comparison. To manually compare collections between two AM installations, download the digest list from the first installation, then upload it to the second for comparison.
Step 1: Generate Digest List
The format of the list is simple: each line contains a SHA-256 digest, followed by one or more spaces or tabs, then the path to the file in the collection.
Example source list.
01348875911b94af38b35f304dcd75348f437734696e26b40fd868eecd687d35 /state/data/state/state-2007-10-ARC/state-20071001-aud-000000.arc.gz
365fe0af21237b750258f6b8c48b25964d0dd5c7d612748eff2f6526f43682bb /state/data/state/state-2007-10-ARC/state-20071001-aud-000001.arc.gz
893298b40da08a1b9ce0c7994c8f2717cedb23d046936c8e24bd62655ca1962b /state/data/state/state-2007-10-ARC/state-20071001-aud-000002.arc.gz
ba7cce400971bd56377e2d79a21192c63e0328e7651728345c49ebf35fb4999d /state/data/state/state-2007-10-ARC/state-20071001-aud-000003.arc.gz
9827832cdfd4a9565422e41fd334eb09a23c835772184936ffebabb147eb5b8a /state/data/state/state-2007-10-ARC/state-20071001-aud-000004.arc.gz
3d55a5b19dd6133e598fb29ff89444fb05196863c21d8773a03dbe16c0b42615 /state/data/state/state-2007-10-ARC/state-20071001-aud-000005.arc.gz
ea0880b33fb9b237299c7e92578f4881c820b93e3d130a5818e3b3a3e90b8872 /state/data/state/state-2007-10-ARC/state-20071001-aud-000006.arc.gz
45b632a3de7ca7c38a916242d78cedce6f11004cef99e4642194a48651db597f /state/data/state/state-2007-10-ARC/state-20071001-aud-000007.arc.gz
bca0a7d6f78b9d46196bb502ef31782d0b3ea5a075ca61691bdd0a2ffc3cfd24 /state/data/state/state-2007-10-ARC/state-20071001-aud-000008.arc.gz
4ec209a01449552454b57d82e12c0848982010ebd7f36e4ac3206576819531cf /state/data/state/state-2007-10-ARC/state-20071001-aud-000009.arc.gz
c4ed9102ba6e8f0ea5f9bfeb06318b3db2230733c5fd9a22b18405dcfe820a7f /state/data/state/state-2007-10-ARC/state-20071001-aud-000010.arc.gz
b6f97b66eff760a3bbef4e7895be9432fa4dfcb206d42d789c5ecfb4343eadc1 /state/data/state/state-2007-10-ARC/state-20071001-aud-000011.arc.gz
ab3f8d618e51032418a3285fd24687f7a2a006cd546de43e575c72b1fed727e4 /state/data/state/state-2007-10-ARC/state-20071001-aud-000012.arc.gz
Directories in a path must be separated by a forward slash (/). This differs from Windows, where directories are separated by a backslash (\), so lists generated on Windows need their separators converted.
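If the source data is not yet in an AM installation, a list in this format can be produced with a short script. The following is a minimal sketch, not part of ACE: the function names `sha256_of` and `digest_lines` are hypothetical, and it assumes that paths should be written relative to the collection root with a leading /, as in the example list above.

```python
import hashlib
import os
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def digest_lines(root):
    """Yield 'digest<TAB>/relative/path' lines for every file under root."""
    root = Path(root)
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # makes the traversal order deterministic
        for name in sorted(filenames):
            full = Path(dirpath) / name
            # as_posix() writes '/' separators, even when run on Windows
            rel = full.relative_to(root).as_posix()
            # Assumption: paths start with '/' relative to the collection root
            yield f"{sha256_of(full)}\t/{rel}"


if __name__ == "__main__":
    import sys

    # Usage (hypothetical script name): python make_digests.py /path/to/collection > digests.txt
    if len(sys.argv) == 2:
        for line in digest_lines(sys.argv[1]):
            print(line)
```

Redirecting the output to a file gives a list that should be suitable for upload in step 2, provided the relative paths match those used by the monitored collection.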
The Audit Manager is able to supply a list of digests for a collection or directory.
- From the status page, select the collection you wish to generate a list for
- Click 'more...', then 'Download Digests'. You will see a list of digests and filenames in the correct format. You can right-click and save the list to your hard drive.
Step 2: Upload to ACE
From the status screen, select the collection you wish to compare. Click the 'more...' link to bring up the drop-down menu, then click 'Compare Collection'. On the screen that appears, click 'Browse' and select the file you saved your digest list into during step 1. Click submit.
If everything in the file you submitted matches the selected collection, you will see a summary showing no differences.
If there are differences, the summary will list them by type.
There are four different ways in which collections may differ.
- Files in original collection, but not in supplied file
  - These are files that appear in the collection being monitored by ACE, but do not appear in the uploaded list.
- Files in supplied file, but not in original collection
  - These are files that appear in the uploaded list, but not in the collection monitored by ACE.
- Files with different names, but same digests
  - These are files that have the same content and differ only in directory or name. This is seen when files are renamed or moved across different operating systems.
- Files with same names, but different digests
  - These are files that have the same directory and name, but different content. This is most likely the result of a bad replication.
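The four categories above can be illustrated by comparing two digest lists locally. This is a rough sketch of the idea, not ACE's actual implementation: `parse_digest_list` and `compare_collections` are hypothetical names, and the choice to report a renamed file only under "different names, same digest" (rather than also under the two "not in" categories) is an assumption.

```python
def parse_digest_list(text):
    """Parse 'digest <spaces or tabs> path' lines into a {path: digest} dict."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first run of whitespace; the path may itself contain spaces
        digest, path = line.split(None, 1)
        # Assumption: hex digests are compared case-insensitively
        entries[path] = digest.lower()
    return entries


def compare_collections(original, supplied):
    """Classify differences between two {path: digest} dicts."""
    orig_only = set(original) - set(supplied)
    supp_only = set(supplied) - set(original)

    # Index unmatched supplied paths by digest to detect renamed or moved files
    supp_by_digest = {}
    for path in supp_only:
        supp_by_digest.setdefault(supplied[path], []).append(path)

    renamed = []
    for path in sorted(orig_only):
        for other in sorted(supp_by_digest.get(original[path], [])):
            renamed.append((path, other))

    renamed_orig = {a for a, _ in renamed}
    renamed_supp = {b for _, b in renamed}
    common = original.keys() & supplied.keys()

    return {
        "in_original_not_supplied": sorted(orig_only - renamed_orig),
        "in_supplied_not_original": sorted(supp_only - renamed_supp),
        "different_names_same_digest": renamed,
        "same_name_different_digest": sorted(
            p for p in common if original[p] != supplied[p]
        ),
    }
```

To use it, parse the digest list downloaded from each installation with `parse_digest_list` and pass the two dictionaries to `compare_collections`; four empty lists mean the collections match.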