Skip to end of metadata
Go to start of metadata

You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 4 Next »

Coding Sheets in tDAR

The set of lines that decodes all of the values for a given column of your spreadsheet or database represents a "Coding Sheet for that column. Each column that is a coded integer, real, or string needs to have a meaningful value associated with the code in order to make the spreadsheet useful.  Creating this association is the purpose of the coding sheet.  You will need a separate coding sheet for each coded column in the spreadsheet.  You use New in the main menu to create these coding sheets in tDAR; a coding sheet would usually reside within the same project that contains your database or spreadsheet). When you create a coding sheet in tDAR, you need to give it a title.  Use something that identifies your project and the variable being coded as when you need it you will only see the title.  For example:

Heshotauthla Fauna Coding Sheet - Taxon
Heshotauthla Ceramic Coding Sheet - Paste

There are two ways to get the coding sheet content into tDAR's coding sheet resource. As you create the coding sheet resources in tDAR,the Submit As box allows you to choose whether to upload a file from your computer (as described above), or cut and paste the lines that represent the coding sheet into a text box on the coding sheet entry. Depending upon whether you choose to upload a file or enter the information in a text box, tDAR will allow you to browse to locate the file or will give you the text box to type (or cut and paste) into. Unless it is a very simple coding sheet you will want to maintain it on your computer rather than just type it in, in case you need to change it later.

In tDAR a coding sheet has as code (e.g., 107), a value (e.g., "St. Johns Black-on-red"), and optionally a description (e.g., "matte, subglaze and glaze varieties included").  For a given coding sheet, the codes can either be all integers or they will be treated as strings. Leading zeros on integer codes are ignored, so 001 and 01 and 1 are interpreted the same (and appear as 1). Codes can also be standardized string values, including strings of numbers that include decimal points.  Thus, Flr and Fil could be context codes and 01.1, 1.1, and 1.2.1 could be also all be codes that would be expressed as strings.  However, string codes must match exactly.  Thus, with a string code flr and Flr are different and 01.1 and 1.1 are different. The values and optional descriptions can have embedded spaces and can include most special characters (this is not true for ontologies, but never mind that for now). The order of the lines (and hence the values) within the coding sheet doesn't matter. The codes need not be in numerical or alphabetical order.

If you are creating a coding sheet from scratch, the easiest thing to do is to create an excel spreadsheet with only these three columns, in this order, no headings, and nothing aftre the coding sheet information.  Save it as an xls file (not a Excel 2007 xlsx file).  You can then upload it directly to tDAR.

Alternatively you can create a text file in csv (comma separated value) format.  You can do this from scratch or can edit the information in an existing coding key document. 

In a tDAR csv format coding sheet each line has a code followed by a comma, followed by is value, followed optionally by a description (e.g. for how this value is distinguished). For example,

1,Male
2,Female
9,Indeterminate,No sex distinctive characteristics

Is a perfectly good coding sheet consisting for three line for the column SEX.

M,Male
F,Female
?,Indeterminate,No sex distinctive characteristics

The values and optional descriptions in a csv file cannot have double quotes within them (if you must they must appear as two quotes in a row, i.e.,"") and if they include a comma then the whole value needs to be enclosed in double quotes.

Converting a word processed or pdf coding key into a set of tDAR CSV coding sheets

If you have a coding key already typed up, you will want to convert that into csv format to avoid tedious retyping. The goal is to get from the text of a formatted document that describes a coding system into tDAR coding sheets.

Coding Key in a Spreadsheet

If your coding key sheet was created in Excel or another spreadsheet, you need to arrange it so that the codes are in the first column (A), their values in the second (B), and more elaborate descriptions are optionally in the third column. There should be nothing to the right of the third column. If you have multiple coding sheets in one spreadsheet you can select the rows that correspond to your coding sheet and say "Save As" and select "CSV" format and name and save the file in an appropriate folder on your computer. If there is just one coding sheet in the spreadsheet, be sure to eliminate any blank lines, titles or text other than the coding sheet information and Save As in CSV format without selecting rows. (When you Save As into csv in Excel you will get a message about losing features, just say OK. )

If your coding key is an Adobe Acrobat (pdf) or word processed document, conversion is a bit more work but for a long coding key it will probably be worth it. You can't save a either a word processed (e.g. Word) or pdf file directly to CSV.

Coding Key in a Word Processor Document

What you want to do with a word processed file is to open the document and use the word processor's search and replace function to help edit the document into a series of sets of lines each of which represents a coding sheet in csv format. For example, you will want to convert tabs to spaces, convert multiple spaces to single spaces, eliminate leading spaces, and eliminate text within the lines of a specific coding key that should not be part of the key. Note that Word and most word processors have means to replace formatting characters such as tabs (in Word 2007, at the bottom of the find and replace box click on "More" and then with the Find tab open click on Format ). You then use Save As to save the modified coding key document in a text format. When you Save AS you must first selected Plain text (.txt) in the dropdown box of formats in Word (or perhaps ASCII format in other word processors). If Word asks, "Windows Default" is OK.

Next you will want to open the plain file in a text editor and fix any problems. Things may show up in the text editor that you don't see in the word processor's view. For Windows machines the supplied text editor is Notepad which is found in the Accessories folder of All Programs. (There are a other much better ones free on the web. One that I have used is http://download.jgsoft.com/editpad/SetupEditPadLite.exe.) Once the document is in the proper shape, be sure to save it (as a txt file is fine).

If your document includes multiple coding sheets (i.e. for multiple columns) then it is easiest to cut and paste each subset into the appropriate tDAR coding sheet. Alternately you can save each set of lines as a separate file (which will probably require deleting all the lines above and below and using Save As, saving it (with a new name so you don't lose your work) with a csv extension, such as "species codes.csv". In this case you use the upload rather than text box option in tDAR.

To convert a pdf document (that has been character recognized), the process is similar. You will want to select all the text in the document, then open a text editor such as Notepad (see above) and paste the text into it. You can then use search and replace and edit the document into the proper format as you would have done, above, with your word processor.

Once You have Submitted the Coding Sheet

Be sure to look at the result. If you had a comma inside a value, you will see that the comma is gone and the text after the comma now appears as the description. You will need to go back and enclose the value in double quotes or eliminate the comma.

Note for more Geeky Types

If you have a lot of consistently lines that need to be converted, you may want to find a fancier text editor that deals with regular expressions that will allow MUCH more complex search and replace operations. I use TextPad ($16.50) that I like a lot http://www.textpad.com. For example, you can do a single search and replace operation that would convert lines of the form:

Bunny (sylvilagus)      010
Jackrabbit (lepus)       020

To a proper coding sheet form:

10,sylvilagus sp. (bunny)
20,lepus sp. (jackrabbit)

However, there is a learning curve here. 

  • No labels