PROC COMPARE Without SAS: Dataset Comparison in StatDataViewer

By Kenneth Yan • March 28, 2026 • 7 min read

Dataset comparison is one of the most common tasks in clinical programming: compare the current delivery of ADSL against the previous one, verify that a dataset re-run produced identical results, or check whether a data amendment touched anything unexpected. The standard SAS tool for this is PROC COMPARE — powerful, but it requires a SAS session, a log to read, and a bit of boilerplate code every time.

StatDataViewer's Compare tool reports the same core results (unmatched rows and cell-level value differences) interactively, without SAS. This post explains how to use it effectively and where it differs from PROC COMPARE.

What PROC COMPARE does — and what we're replicating

PROC COMPARE matches rows between two datasets using one or more ID variables, then reports:

Rows in the base dataset but not the compare dataset (and vice versa).
For matched rows: which variables have different values, and what those values are.

That is exactly what StatDataViewer's Compare tool produces — the key difference is that the results are presented as an interactive table you can sort, filter, and drill into, rather than a text log.
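The matching logic described above can be sketched in a few lines of plain Python. This is an illustration of the general technique only, not StatDataViewer's or SAS's implementation; the function name `compare_rows` and the sample records are hypothetical, and the sketch assumes the ID variables identify each row uniquely.

```python
def compare_rows(base, compare, id_vars):
    """Match rows by ID variables and return the three result categories."""
    def key(row):
        return tuple(row[v] for v in id_vars)

    # Index each dataset by its ID key (assumes keys are unique).
    base_by_key = {key(r): r for r in base}
    comp_by_key = {key(r): r for r in compare}

    base_only = [k for k in base_by_key if k not in comp_by_key]
    compare_only = [k for k in comp_by_key if k not in base_by_key]

    # For matched rows, record every common non-ID variable whose value differs.
    diffs = []
    for k, b in base_by_key.items():
        c = comp_by_key.get(k)
        if c is None:
            continue
        for var in b:
            if var in c and var not in id_vars and b[var] != c[var]:
                diffs.append((k, var, b[var], c[var]))
    return base_only, compare_only, diffs


# Hypothetical sample data: subject 001 dropped, 003 added, AGE changed for 002.
base = [
    {"USUBJID": "001", "AGE": 34, "SEX": "F"},
    {"USUBJID": "002", "AGE": 51, "SEX": "M"},
]
new = [
    {"USUBJID": "002", "AGE": 52, "SEX": "M"},
    {"USUBJID": "003", "AGE": 47, "SEX": "F"},
]
base_only, compare_only, diffs = compare_rows(base, new, ["USUBJID"])
```

Here `base_only` holds the key for subject 001, `compare_only` the key for subject 003, and `diffs` a single entry for the AGE change on subject 002.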

Setting up a comparison

Open Tools → Compare Datasets. The dialog has four main inputs:

1. Base dataset

Select from any open library. This is your reference — typically the previous version, the production dataset, or the expected output.

2. Compare dataset

Select the dataset you want to compare against the base. This is typically the new delivery, the re-run output, or the amended version.

The base and compare datasets do not need to be in the same library. Add PREV_DELIVERY and CURRENT_DELIVERY as separate libraries and select across them.

3. ID variables

Specify the variables that uniquely identify a row. For ADSL this is STUDYID USUBJID; for ADaM datasets with multiple records per subject it is typically STUDYID USUBJID ASEQ. For SDTM: STUDYID USUBJID AESEQ for AE, or STUDYID USUBJID LBTESTCD LBDTC for LB.

Rows are matched across datasets using the ID variables. Unmatched rows appear in the "In base only" and "In compare only" sections of the results.
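Because matching depends on the ID variables being unique, it can help to check for duplicate keys before comparing. The sketch below shows one way to do that outside the tool; `duplicate_keys` and the sample AE records are hypothetical, and how StatDataViewer itself treats duplicate keys is not covered here.

```python
from collections import Counter


def duplicate_keys(rows, id_vars):
    """Return every ID-key combination that occurs on more than one row."""
    counts = Counter(tuple(r[v] for v in id_vars) for r in rows)
    return [k for k, n in counts.items() if n > 1]


# Hypothetical AE records: subject 001 has two events.
ae = [
    {"STUDYID": "S1", "USUBJID": "001", "AESEQ": 1},
    {"STUDYID": "S1", "USUBJID": "001", "AESEQ": 2},
    {"STUDYID": "S1", "USUBJID": "002", "AESEQ": 1},
]

# STUDYID + USUBJID alone is not unique for AE; adding AESEQ makes it unique.
dups_subject = duplicate_keys(ae, ["STUDYID", "USUBJID"])
dups_full = duplicate_keys(ae, ["STUDYID", "USUBJID", "AESEQ"])
```

An empty result, as with `dups_full` here, suggests the chosen ID variables are sufficient for row matching.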

4. Comparison variables (optional)

Leave blank to compare all common variables. Specify a subset to limit the comparison to particular columns — useful for a targeted review of a single amended variable.
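The "blank means all common variables" rule can be sketched as follows. This is an assumption about how such a default might be resolved, not the tool's documented behavior; `resolve_comparison_vars` and the variable lists are hypothetical.

```python
def resolve_comparison_vars(base_vars, compare_vars, id_vars, requested=None):
    """Pick the columns to compare: all common non-ID columns, or a subset."""
    common = [v for v in base_vars if v in compare_vars and v not in id_vars]
    if not requested:
        return common
    # A requested variable must exist in both datasets to be comparable.
    missing = [v for v in requested if v not in common]
    if missing:
        raise ValueError(f"not present in both datasets: {missing}")
    return list(requested)


base_vars = ["STUDYID", "USUBJID", "AGE", "SEX", "TRT01P"]
compare_vars = ["STUDYID", "USUBJID", "AGE", "SEX", "RACE"]
id_vars = ["STUDYID", "USUBJID"]

all_common = resolve_comparison_vars(base_vars, compare_vars, id_vars)
targeted = resolve_comparison_vars(base_vars, compare_vars, id_vars, ["AGE"])
```

In this sketch, TRT01P and RACE drop out of the default comparison because each exists in only one dataset.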

Reading the results

After clicking Compare, the results appear in three panels:

Matching rows

A summary count of rows successfully matched by ID. High match rates mean the datasets align structurally; a low rate usually means the ID variables are wrong or the dataset was restructured.

In base only / In compare only

Rows that appear in one dataset but not the other. These are shown as a separate grid you can scroll through. Common causes:

Subjects added or dropped between deliveries.
New AE or CM records in an amendment.
ID variable discrepancies (e.g. trailing spaces, case differences).
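To tell genuine additions and deletions apart from ID discrepancies, one option is to normalize the unmatched keys and look for pairs that then agree. The sketch below is a diagnostic you could run on exported keys, under the assumption that whitespace and case are the only differences; `normalize` and `near_misses` are hypothetical helper names.

```python
def normalize(value):
    """Strip surrounding whitespace and upper-case string values."""
    return value.strip().upper() if isinstance(value, str) else value


def near_misses(base_only, compare_only):
    """Pair unmatched keys that become equal after normalization."""
    by_norm = {}
    for k in compare_only:
        by_norm.setdefault(tuple(normalize(v) for v in k), []).append(k)
    pairs = []
    for b in base_only:
        for c in by_norm.get(tuple(normalize(v) for v in b), []):
            pairs.append((b, c))
    return pairs


# Hypothetical unmatched keys: one differs only by case and a trailing space.
base_only = [("S1", "sub-001 ")]
compare_only = [("S1", "SUB-001"), ("S1", "SUB-009")]
pairs = near_misses(base_only, compare_only)
```

Any pair returned points to a key-formatting problem rather than a real added or dropped record; keys with no partner, such as SUB-009 here, are more likely genuine.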

Value differences

Cell-level differences for matched rows. Each row in this grid shows:

The variable name.
The row identifier (your ID variable values).
The value in the base dataset.
The value in the compare dataset.

You can sort by variable name to group all changes to a single variable, or sort by row ID to see all changes for a particular subject.

Filtering before comparing

Apply a dataset filter to either the base or compare dataset before running Compare. This is useful for a scoped review — for example, compare only the adverse events for treatment arm A, or compare only a specific visit.

Filter with D, confirm the filter in the status bar, then open Tools → Compare Datasets. The comparison will use only the filtered rows.
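One consequence worth keeping in mind: if only one side is filtered, rows excluded on that side can show up as unmatched on the other. The sketch below illustrates this with plain key sets; it follows from the statement that only filtered rows are compared, but it is not a description of the tool's internals, and the data and the `in_arm_a` predicate are hypothetical.

```python
def keys(rows, id_vars):
    """Return the set of ID-key tuples present in the given rows."""
    return {tuple(r[v] for v in id_vars) for r in rows}


def in_arm_a(row):
    return row["TRTA"] == "A"


id_vars = ["USUBJID", "AESEQ"]
base = [
    {"USUBJID": "001", "AESEQ": 1, "TRTA": "A"},
    {"USUBJID": "002", "AESEQ": 1, "TRTA": "B"},
]
new = [
    {"USUBJID": "001", "AESEQ": 1, "TRTA": "A"},
    {"USUBJID": "002", "AESEQ": 1, "TRTA": "B"},
]

# Filter on base only: the arm B row in the unfiltered dataset looks "compare only".
one_sided = keys(new, id_vars) - keys(filter(in_arm_a, base), id_vars)

# Same filter on both sides: no spurious unmatched rows.
both_sided = keys(filter(in_arm_a, new), id_vars) - keys(filter(in_arm_a, base), id_vars)
```

For a scoped review, applying the same filter to both datasets keeps the unmatched-row sections meaningful.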

How it differs from PROC COMPARE

Feature	PROC COMPARE	StatDataViewer Compare
Requires SAS license	Yes	No
Output format	Text report (ODS listing) or OUT= dataset	Interactive table (sortable, filterable)
Numeric tolerance (CRITERION=)	Yes	Exact match only (no tolerance setting yet)
Cross-library comparison	Requires LIBNAME statements	Point-and-click
Filter before compare	WHERE= option in dataset options	Apply dataset filter, then run Compare
Workflow	Write code → submit → read output	Open dialog → click Compare → browse results

For most day-to-day comparison tasks in clinical programming, StatDataViewer's Compare tool is faster. PROC COMPARE remains the better choice when you need numeric tolerance, custom output datasets, or integration with a validation pipeline.
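To see why numeric tolerance matters, consider a value that is re-derived through floating-point arithmetic. The sketch below contrasts exact comparison with a simple absolute tolerance using Python's `math.isclose`. It is not PROC COMPARE's CRITERION= formula, which has its own comparison methods; `values_differ` and the threshold are hypothetical.

```python
import math


def values_differ(base_value, compare_value, abs_tol=0.0):
    """Exact comparison by default; absolute tolerance for numbers if abs_tol > 0."""
    both_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool)
        for v in (base_value, compare_value)
    )
    if both_numeric and abs_tol > 0:
        return not math.isclose(
            base_value, compare_value, rel_tol=0.0, abs_tol=abs_tol
        )
    return base_value != compare_value


# 0.1 + 0.2 is not exactly 0.3 in binary floating point.
rederived = 0.1 + 0.2
exact = values_differ(0.3, rederived)                    # flagged as a difference
tolerant = values_differ(0.3, rederived, abs_tol=1e-9)   # treated as equal
```

With exact matching, differences of this kind appear in the value differences grid even though they carry no analytical meaning, which is the main case for reaching for PROC COMPARE with a tolerance.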

Try it now

The Compare tool is available in the free version — no license required. See the Compare datasets documentation for the complete reference, or try StatDataViewer in your browser with sample data right now.
