Troubleshooting Data Sets in tDAR

Over the past few years we've seen a lot of data sets go into tDAR. In that time, we've learned a great deal about what makes a data set ready for archiving, and conversely, what some of the common problem-spots are. Within tDAR, we try to catch many of these errors and provide users with warnings. Below are a few tips to avoid the most common issues we see when archiving a CSV, TAB or Excel data set.

Best Practices

Add Column Names

Verify that each data table in your data set has column names. Column names are critical to proper archiving. Without column names, it becomes difficult to understand what data or type of data is in each column. Make sure the column name is descriptive and clear.

Example: Data set without any column names 

2495chert
25071chert
26016chert
2977chert
2991chert
2821chert

Example: Data set with column names

LocusQuantityMaterial
2495chert
25071chert
26016chert
2977chert
2991chert
2821chert

Add Column Descriptions

Add column descriptions to make the data easier to understand. Even better, once you've uploaded a data set to tDAR, add descriptions to your column identifying counts, how the data was collected, or any notes or issues.

Example:

Verify Column Formatting

If you format columns as numbers or currency, double check that every row matches that format. To check the formatting, highlight the cell, column, or cells and choose "Format > Cells" and check:

Example: Numeric column with textual data

 

Check for Errors in Calculations or Functions

Review your data workbooks prior to uploading to check for errors in calculations or functions. tDAR will also check your data for errors where possible and try and warn you.

Example: Workbook with Excel errors

Upload and Apply Coding Sheets to Unlock Coded Data

Upload and apply coding sheets in tDAR to unlock coded data. When collecting data, we often use coding sheets to simplify and speed up data entry.  Once collected, however, it's important to capture the coded-values and coding sheets within tDAR as well, otherwise, it can become difficult to understand what the data represents. For example, in the table below, what is Material "B"? It could be "Bead" or "Bone" or something entirely different.

Example: Workbook with coded data

Check for Stray Data

Verify that you don't have stray data (check columns or comments) in your columns.Often when working with a data set, users will add a check columnto total up a row to check a value or add a comment. Make sure that these columns have been removed, or clearly labeled with a column name.

Example: Workbook unlabeled comments

Example: Workbook check-column

Verify Data is Properly Loaded

Check that your data has been loaded into Excel properly. Often, Excel is not the "starting point" for data. Instead data has been imported from another source; this may be a "tab" or character-delimited file, or a plain text file. Sometimes, Excel may not load the data properly and split data into multiple columns, or combine it into a single column. It's important to check every row to make sure that the data has been imported properly.

Example: Workbook with data-conversion errors (note level data)


Common Issues / Problems

Merged Column Headers

Avoid merged column headers. Make sure that each column has its own unique header column. For example: the material column should be two columns.

Example: Data set with column names and merged column headers (incorrect)

LocusQuantityMaterial
2495stonewarepottery
25071earthenwarepottery
26016coiledpottery
2977earthenwarepottery
2991stonewarepottery
2821stonewarepottery

Example: Data set with column names and separate column headers (corrected)

LocusQuantityMaterial (detailed)Material (General)
2495stonewarepottery
25071earthenwarepottery
26016coiledpottery
2977earthenwarepottery
2991stonewarepottery
2821stonewarepottery

Multiple Data Tables

Don't include multiple data tables within the same excel worksheet. tDAR treats each workbook as its own data table.  Adding multiple within the same workbook prevents the system from properly archiving or documenting the data within the columns.

Example 5: Workbook with two tables (incorrect)

Data as Images or Graphics

Don't include data as images or graphics in Excel Workbooks. tDAR cannot read the image contents. The row data will not be a "valid" data set.  In most cases, data like this should be made into a PDF and uploaded as a document.

Example: Workbook with image data

 

Example: Workbook with graphics and textual comments descriptions