Planning Your Data Contribution

Choosing Supported File Formats

Identify Your File Formats

tDAR supports a number of commonly used file types, but no repository supports all file types. Review the data you plan to submit and list, for each file, its type (e.g. document, image, dataset), the program used to create it (e.g. Microsoft Word, Adobe Illustrator), and especially its file extension (usually the three or four letters at the end of the file name, e.g. surveyReport.doc, faunalData.xlsx).
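For a large submission, the inventory can be drafted with a short script. The following is a minimal sketch, not a tDAR tool: the function names are illustrative, and the file names in the usage lines are hypothetical examples.

```python
from collections import Counter
from pathlib import Path


def extension_of(filename):
    """Return the lower-cased extension, e.g. '.doc' for 'surveyReport.doc'."""
    return Path(filename).suffix.lower()


def inventory_extensions(filenames):
    """Count how many files use each extension."""
    return Counter(extension_of(name) for name in filenames)


# Hypothetical file names standing in for the data you plan to submit.
files = ["surveyReport.doc", "faunalData.xlsx", "sitePhoto.JPG", "levelForm.doc"]
for ext, count in sorted(inventory_extensions(files).items()):
    print(ext or "(no extension)", count)
```

To inventory a real folder, the list of names could be built with `Path.rglob("*")` from the standard library instead of being typed by hand.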

Verify That tDAR Supports Your Formats

Table 1 lists the popular file formats that tDAR supports. If your file type is not listed in the column “tDAR Supported File Types”, you will need to convert your files before submitting them to tDAR.
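The check against Table 1 can also be sketched in code. The lookup below is transcribed from the “tDAR Supported File Types” entries for three resource types only; the remaining rows could be added the same way. The names `SUPPORTED` and `needs_conversion` are illustrative assumptions, not part of tDAR.

```python
from pathlib import Path

# Extensions transcribed from Table 1 for three resource types (not the full table).
SUPPORTED = {
    "Documents": {".pdf", ".pdfa", ".doc", ".docx", ".txt"},
    "Images": {".jpg", ".bmp", ".tif", ".png", ".gif"},
    "Datasets": {".xls", ".xlsx", ".mdb", ".mdbx", ".csv"},
}


def needs_conversion(filename, resource_type):
    """Return True if the file's extension is not listed for that resource type."""
    extension = Path(filename).suffix.lower()
    return extension not in SUPPORTED[resource_type]


# Hypothetical examples: an OpenOffice document and an Excel dataset.
print(needs_conversion("surveyReport.odt", "Documents"))
print(needs_conversion("faunalData.xlsx", "Datasets"))
```

The first call reports that the .odt file needs converting; the second reports that the .xlsx file can be submitted as it is.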

Converting Files

File conversion tools can be found by searching the web using a phrase similar to “converting X file type to Y file type”, where X is the file extension your data currently use and Y is a file extension tDAR supports for the same resource type.

Keep in mind that conversion tools may not translate your data perfectly. Compare each converted file against its original to confirm that it contains the same pertinent information.
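For tabular data, part of this comparison can be automated. The sketch below assumes that both the original and the converted dataset can be exported as CSV text; that is an assumption about your workflow, not a tDAR requirement. It reports every row that differs, and the function names are illustrative.

```python
import csv
import io


def read_rows(csv_text):
    """Parse CSV text into rows, trimming surrounding whitespace in each cell."""
    reader = csv.reader(io.StringIO(csv_text))
    return [[cell.strip() for cell in row] for row in reader]


def diff_tables(original_text, converted_text):
    """Return (row_number, original_row, converted_row) for each differing row."""
    original = read_rows(original_text)
    converted = read_rows(converted_text)
    differences = []
    for i in range(max(len(original), len(converted))):
        a = original[i] if i < len(original) else None  # None marks a missing row
        b = converted[i] if i < len(converted) else None
        if a != b:
            differences.append((i + 1, a, b))
    return differences


# Hypothetical data: the converted file has lost its last row.
original = "id,taxon\n1,Odocoileus\n2,Bison\n"
converted = "id,taxon\n1,Odocoileus\n"
print(diff_tables(original, converted))
```

An empty result only shows that the cell values match. Formatting, formulas and embedded objects still need a visual check.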

Noting any file conversions in a resource’s “General Notes” metadata field is an excellent practice.

Table 1. tDAR Supported File Types

Resource Type: Documents
Common File Types: Microsoft (MS) Word (.doc, .docx); OpenOffice Writer (.odt); PDF (.pdf); Corel WordPerfect (.wpd); Rich Text Format (.rtf); Plain Text (.txt)
tDAR Supported File Types: PDF (.pdf, .pdfa); MS Word (.doc, .docx); Plain Text (.txt)

Resource Type: Images
Common File Types: Bitmap (.bmp); JPEG (.jpg); GIF (.gif); PNG (.png); TIFF (.tif); Adobe Photoshop (.psd); Adobe Illustrator (.ai); Encapsulated PostScript (.eps); Scalable Vector Graphics (.svg)
tDAR Supported File Types: JPEG (.jpg); Bitmap (.bmp); TIFF (.tif); PNG (.png); GIF (.gif)

Resource Type: Datasets
Common File Types: MS Excel (.xls, .xlsx); MS Works (.wks); MS Access (.accdb, .mdb, .mdbx); OpenOffice Calc (.ods); OpenOffice Base (.odb); dBase (.dbf)
tDAR Supported File Types: MS Excel (.xls, .xlsx); MS Access (.mdb, .mdbx); Comma-separated Values (.csv)

Resource Type: Coding Sheets
Common File Types: -----
tDAR Supported File Types: MS Excel (.xls, .xlsx); Comma-separated Values (.csv); Text (entered on tDAR webpage)

Resource Type: Ontologies
Common File Types: -----
tDAR Supported File Types: Web Ontology Language (.owl)

Resource Type: Sensory Data
Common File Types: -----
tDAR Supported File Types: JPEG (.jpg); TIFF (.tif); ZIP Archive (.zip); TAR Archive (.tar, .tgz); GIF (.gif)

Resource Type: Geospatial Files
Common File Types: -----
tDAR Supported File Types: .jpw or .tfw; .aux and .aux.xml; .ovr or .rrd; .shp; .shx; .dbf; .prj; .sbn, .sbx; .fbn, .fbx; .ain, .aih; .atx; .ixs, .mxs; .cpg; .mdb and .gdb

Managing Security and Access

 

Confidentiality of Resources

Marking a Resource as Confidential

When you mark a file as containing confidential information, the file will never be accessible to the public. Its metadata remains visible, but the file itself is not visible and cannot be downloaded unless you give access rights to a specific tDAR user.

To mark a file as confidential, select "Confidential" from the drop-down menu with the sub-heading "This item has access restrictions".

Why would I mark a resource as "Confidential"? You may choose to mark a resource as confidential if you feel that it contains sensitive data that could endanger an archaeological resource, information that affiliated or other interested communities might not wish to be widely available, or information that you are not prepared to share. For example, you may choose to mark a file that contains mortuary features as confidential to respect the wishes of affiliated communities to restrict access to this information. Such data should likely remain restricted to professional bioarchaeologists and others who will treat the information with proper respect.

Redacting Confidential Data

tDAR automatically generalizes specific locations set using the website’s “Select Region” tool, but uploaders should be careful to review their resources to ensure that site locations and other sensitive information are redacted from all text, images or other fields where they might appear.

  • For more information on redacting information, see Redaction.

Setting Access Permissions

Files can be marked as confidential, restricting access to only users the uploader specifies but allowing all users to see the associated metadata.

Embargoing Resources

Resources can be uploaded to tDAR with an embargo that keeps the resource private until a specified future date. Uploaders can allow other specified users to access embargoed files before the embargo ends (e.g. when multiple researchers want to use tDAR to collaborate on a dataset without releasing it publicly yet).

 

When you mark a file as embargoed, you are restricting access to the file for 5 years. In other words, the file will not be accessible to the public for the next 5 years. The file's metadata will be visible during that period, but the file itself is not visible and cannot be downloaded. After the embargo period has ended, the file will become accessible to the public. 

To mark a file as embargoed, select "Embargoed" from the drop-down menu with the sub-heading "This item has access restrictions".

Why would I mark a resource as "Embargoed"? You may choose to mark a file as embargoed to restrict access to the resource for a limited period of time. For example, you may wish to register a file with tDAR that houses data for an ongoing research project. You would like to store the data and share them with a select group of colleagues working with you on the project. However, the data must remain restricted until the project is complete and the results are published in some fashion. You can mark this resource as embargoed to indicate that it is restricted for a period of time before it can be made available to the public.
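The access rules described in this section can be summarized as a small model. This is only an illustrative sketch, not tDAR's implementation. The restriction labels, the `FileRecord` class, and the choice to count the five years from the date the file was marked are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date

EMBARGO_YEARS = 5  # embargo length stated in the text


@dataclass
class FileRecord:
    restriction: str  # "public", "confidential" or "embargoed" (hypothetical labels)
    marked_on: date   # assumed start of the restriction
    authorized_users: frozenset = frozenset()


def embargo_end(marked_on):
    """Return the date five years after marked_on."""
    try:
        return marked_on.replace(year=marked_on.year + EMBARGO_YEARS)
    except ValueError:
        # 29 February has no counterpart five years later; 1 March is an assumption.
        return date(marked_on.year + EMBARGO_YEARS, 3, 1)


def can_download(record, user, today):
    """Return True if this user may download the file today."""
    if user is not None and user in record.authorized_users:
        return True  # users granted access by the uploader
    if record.restriction == "public":
        return True
    if record.restriction == "embargoed":
        return today >= embargo_end(record.marked_on)
    return False  # confidential files never become public


# Hypothetical record: embargoed in 2020 and shared with one colleague.
record = FileRecord("embargoed", date(2020, 1, 15), frozenset({"colleague"}))
print(can_download(record, None, date(2024, 6, 1)))
print(can_download(record, "colleague", date(2024, 6, 1)))
```

In both the confidential and the embargoed case the metadata stays visible; the model covers only access to the file itself.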

Organizing Your Data

Before submitting your data to tDAR, organize your resources into groups based on some commonality. Useful organization strategies include grouping resources by:

  • Archaeological projects (e.g. Big Bend River Survey, Excavation at 45WH30).
  • Reports/Articles and related resources, such as datasets.

Projects

In tDAR, a Project is a special kind of resource that groups otherwise disparate resources together.

Projects are not necessarily analogous to archaeological excavations, but for discrete archaeological projects, it makes sense to group together excavation photos, site reports, level forms and artifact measurements under a tDAR Project named for the archaeological project.

The advantage of using a Project to organize your resources is twofold:

Inheritance

Projects allow users to set general metadata at the Project level. Resources that are grouped under a Project will “inherit” the Project-level metadata automatically, saving users from having to enter repetitious metadata at the Resource level. Resource-level metadata can still be customized for each resource, allowing more specific information to be used for individual files or resources.

For example, if all your resources deal with the period from 200-1450 A.D., you can set the Project “Temporal Coverage” field accordingly. Now every resource added to the project will be able to inherit the “Temporal Coverage” field from the Project, without needing to re-enter “200-1450 A.D.” for every resource. When a project is updated, all of the associated individual resources are automatically updated as well.
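The inheritance behaviour can be pictured as a lookup that checks the resource first and falls back to the Project. The sketch below uses Python's `collections.ChainMap` to model this. It is a conceptual illustration under that assumption, not a description of how tDAR stores metadata, and the "Title" field and its value are hypothetical.

```python
from collections import ChainMap


def effective_metadata(project, resource_overrides):
    """Resource-level values win; anything missing falls back to the Project."""
    return ChainMap(resource_overrides, project)


project = {"Temporal Coverage": "200-1450 A.D."}

# A resource that sets only its own title inherits the Project's coverage.
photo = effective_metadata(project, {"Title": "Unit 3 excavation photo"})
print(photo["Temporal Coverage"])

# A resource that customizes the field keeps its own, more specific value.
report = effective_metadata(project, {"Temporal Coverage": "900-1100 A.D."})
print(report["Temporal Coverage"])

# Updating the Project is reflected in every resource that inherits the field.
project["Temporal Coverage"] = "200-1500 A.D."
print(photo["Temporal Coverage"])
```

Because the lookup is resolved when a field is read, a change to the Project reaches the inheriting resources without anything being re-entered, while the customized value on the report is left alone.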

Organization

Projects allow users to move from the Resource level and find other resources from the same project.

For example, a user reading the Kennewick Man Cultural Affiliation Report might also want to see the DNA testing results. The report is grouped under the Project title “The Archaeology of Kennewick Man”; by clicking on the Project title, the user can see other associated resources, such as letters from the U.S. Secretary of the Interior to the U.S. Army Corps of Engineers, radiocarbon dating results and (eureka!) the DNA testing results.

Collections

Collections are a convenient way to organize and display resources and to more easily manage permissions on groups of resources.

Collections can be stacked or nested to allow you to group and embed projects, independent resources, and other collections. As the diagram below shows, you can place any combination of projects, resources, and collections under a parent collection.
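Nesting of this kind behaves like a tree. The sketch below is a simplified illustration in which a collection holds resource titles and child collections, and a Project would be modelled as just another named group. The class, the function, and the example names are hypothetical and do not reflect tDAR's internal data model.

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    name: str
    resources: list = field(default_factory=list)  # resources placed directly here
    children: list = field(default_factory=list)   # nested Collection objects


def all_resources(collection):
    """List every resource in a collection, including those in nested collections."""
    found = list(collection.resources)
    for child in collection.children:
        found.extend(all_resources(child))
    return found


# Hypothetical parent collection with one nested collection.
survey = Collection("Survey photos", resources=["photo_001", "photo_002"])
parent = Collection("Lab archive", resources=["site report"], children=[survey])
print(all_resources(parent))
```

Walking the tree this way shows why a setting applied to the parent collection can reach everything grouped beneath it.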

Testing Your Plan

Pilot Ingest

After organizing your data, you may want to test your strategy with a small portion of your data to be sure that your plan will work as expected.

Choose a representative sample of the types of files you will be creating metadata for and uploading. Be sure to include at least one example of each file type.
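A quick way to draft such a sample is to take one file per extension. The helper below is a minimal sketch with illustrative names; the file names are hypothetical, and you may still want to add files by hand so that each resource type and access restriction is also represented.

```python
from pathlib import Path


def pilot_sample(filenames):
    """Pick the first file seen for each extension so every file type appears once."""
    sample = {}
    for name in filenames:
        extension = Path(name).suffix.lower()
        sample.setdefault(extension, name)  # keep only the first file per extension
    return sorted(sample.values())


# Hypothetical file names from a planned submission.
files = ["levelForm.doc", "surveyReport.doc", "faunalData.xlsx", "sitePhoto.jpg"]
print(pilot_sample(files))
```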

Upload your sample files and create metadata for the resources. Test the search function to be sure that your resources appear when a user searches for something you include in your metadata.

Try searching, accessing, and downloading any resources you have marked as confidential or embargoed. Check downloaded files to be sure that redacted information cannot be found visually or by using a program’s text search functions.
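The text-search part of this check can be scripted for files whose text can be extracted. The sketch below looks for a list of sensitive terms in a string, ignoring case. The function name and the sample text are hypothetical, and the site number is borrowed from the organization examples earlier in this guide.

```python
import re


def find_sensitive_terms(text, terms):
    """Return the terms that still appear in the text, ignoring case."""
    return [
        term
        for term in terms
        if re.search(re.escape(term), text, flags=re.IGNORECASE)
    ]


# Hypothetical text taken from a downloaded, supposedly redacted file.
downloaded_text = "Excavation at 45wh30 continued through the summer."
print(find_sensitive_terms(downloaded_text, ["45WH30", "UTM"]))
```

A search like this only covers text a program can read. Scanned pages, maps and photographs still need to be checked visually.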

If anything did not work properly, now is the time to revise your plan and organization strategy and try the pilot ingest again.

If everything works as expected, congratulations! You are ready to begin uploading your data into tDAR, and will have saved yourself a lot of angst and frustration by preparing properly.