Choosing Supported File Formats

Identify Your File Formats

tDAR supports a number of commonly used file types, but no repository supports all file types. Review the data you plan to submit and list, for each file, its type (e.g., document, image, dataset), the program used to create it (e.g., Microsoft Word, Adobe Illustrator), and especially its file extension (usually the three or four letters after the final period in the file name, e.g., surveyReport.doc, faunalData.xlsx).
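For a large collection, the list of file extensions can be drafted automatically. The following is a minimal Python sketch, not a tDAR tool; the function name inventory_extensions and the folder you pass to it are illustrative assumptions.

```python
from collections import Counter
from pathlib import Path


def inventory_extensions(root):
    """Count the files under `root` (including subfolders) by lowercase extension."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Path.suffix is the final extension, e.g. ".xlsx"; it is "" when there is none.
            counts[path.suffix.lower() or "(no extension)"] += 1
    return counts
```

Calling inventory_extensions on the folder that holds your files returns a tally such as one entry for ".doc" and one for ".xlsx", which you can then compare against the formats tDAR supports.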

Verify That tDAR Supports Your Formats

Table 1 shows which popular file formats tDAR supports. If your file type is not listed under “tDAR Supported File Types” for its resource type, you will need to convert your files before submitting them to tDAR.
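The check against Table 1 can also be scripted. The sketch below copies the supported extensions from Table 1 into a Python dictionary (geospatial files are left out for brevity); the names SUPPORTED and needs_conversion are illustrative assumptions, and Table 1 remains the authoritative list.

```python
from pathlib import Path

# Extensions copied from the "tDAR Supported File Types" entries in Table 1.
SUPPORTED = {
    "Documents": {".pdf", ".pdfa", ".doc", ".docx", ".txt"},
    "Images": {".jpg", ".bmp", ".tif", ".png", ".gif"},
    "Datasets": {".xls", ".xlsx", ".mdb", ".mdbx", ".csv"},
    "Coding Sheets": {".xls", ".xlsx", ".csv"},
    "Ontologies": {".owl"},
    "Sensory Data": {".jpg", ".tif", ".zip", ".tar", ".tgz", ".gif"},
}


def needs_conversion(filename, resource_type):
    """Return True if the file's extension is not listed for that resource type."""
    return Path(filename).suffix.lower() not in SUPPORTED[resource_type]
```

For example, needs_conversion("surveyReport.odt", "Documents") is True because .odt appears only among the common file types, while needs_conversion("faunalData.xlsx", "Datasets") is False.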

Converting Files

To find a conversion tool, search the web for a phrase such as “converting X file type to Y file type,” where X is the file extension your data are in and Y is a file extension tDAR supports for the same resource type.

Keep in mind that conversion tools may not translate your data perfectly. Compare each converted file with its original to confirm that it contains the same pertinent information.
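For datasets, part of this comparison can be automated. The sketch below assumes that you can export both the original and the converted dataset to CSV (for example, from the program that created each one) and reports any rows that differ. The function names are illustrative, and a clean result does not replace a visual check of formatting, formulas, or multiple worksheets.

```python
import csv


def read_rows(path):
    """Read a CSV file into a list of rows, trimming whitespace around each cell."""
    with open(path, newline="", encoding="utf-8") as handle:
        return [[cell.strip() for cell in row] for row in csv.reader(handle)]


def compare_csv(original_path, converted_path):
    """Return a list of human-readable differences between two CSV files."""
    original = read_rows(original_path)
    converted = read_rows(converted_path)
    differences = []
    if len(original) != len(converted):
        differences.append(f"row count: {len(original)} vs {len(converted)}")
    # Compare rows pairwise; rows beyond the shorter file are covered by the count check.
    for number, (before, after) in enumerate(zip(original, converted), start=1):
        if before != after:
            differences.append(f"row {number}: {before} vs {after}")
    return differences
```

An empty list means the two exports match cell for cell; any entries point to the rows worth inspecting by hand.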

Noting any file conversions in a resource’s “General Notes” metadata field is an excellent practice.

Table 1. tDAR Supported File Types

Resource Type: Documents
Common File Types: Microsoft (MS) Word (.doc, .docx); OpenOffice Writer (.odt); PDF (.pdf); Corel WordPerfect (.wpd); Rich Text Format (.rtf); Plain Text (.txt)
tDAR Supported File Types: PDF (.pdf, .pdfa); MS Word (.doc, .docx); Plain Text (.txt)

Resource Type: Images
Common File Types: Bitmap (.bmp); JPEG (.jpg); GIF (.gif); PNG (.png); TIFF (.tif); Adobe Photoshop (.psd); Adobe Illustrator (.ai); Encapsulated PostScript (.eps); Scalable Vector Graphics (.svg)
tDAR Supported File Types: JPEG (.jpg); Bitmap (.bmp); TIFF (.tif); PNG (.png); GIF (.gif)

Resource Type: Datasets
Common File Types: MS Excel (.xls, .xlsx); MS Works (.wks); MS Access (.accdb, .mdb, .mdbx); OpenOffice Calc (.ods); OpenOffice Base (.odb); dBASE (.dbf)
tDAR Supported File Types: MS Excel (.xls, .xlsx); MS Access (.mdb, .mdbx); Comma-separated Values (.csv)

Resource Type: Coding Sheets
Common File Types: -----
tDAR Supported File Types: MS Excel (.xls, .xlsx); Comma-separated Values (.csv); Text (entered on tDAR webpage)

Resource Type: Ontologies
Common File Types: -----
tDAR Supported File Types: Web Ontology Language (.owl)

Resource Type: Sensory Data
Common File Types: -----
tDAR Supported File Types: JPEG (.jpg); TIFF (.tif); ZIP Archive (.zip); TAR Archive (.tar, .tgz); GIF (.gif)

Resource Type: Geospatial Files
Common File Types: -----
tDAR Supported File Types: .jpw or .tfw; .aux and .aux.xml; .ovr or .rrd; .shp; .shx; .dbf; .prj; .sbn, .sbx; .fbn, .fbx; .ain, .aih; .atx; .ixs, .mxs; .cpg; .mdb and .gdb

Testing Your Plan

Pilot Ingest

After organizing your data, you may want to test your strategy with a small portion of your data to be sure that your plan will work as expected.

Choose a representative sample of the types of files you will be creating metadata for and uploading. Be sure to include at least one example of each file type.
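One way to make sure every file type is represented is to pick the sample by extension. The following is a rough Python sketch under that assumption; pick_pilot_sample is a hypothetical helper, and you may still want to swap in files that better represent your collection.

```python
from pathlib import Path


def pick_pilot_sample(root, per_type=1):
    """Return up to `per_type` files for each file extension found under `root`."""
    sample = {}
    # Sorting makes the selection repeatable from one run to the next.
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            chosen = sample.setdefault(path.suffix.lower(), [])
            if len(chosen) < per_type:
                chosen.append(path)
    return sample
```

The result maps each extension to the files chosen for it, so a missing extension is easy to spot before you begin the pilot upload.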

Upload your sample files and create metadata for the resources. Test the search function to be sure that your resources appear when a user searches for something you include in your metadata.

Try searching for, accessing, and downloading any resources you have marked as confidential or embargoed. Check downloaded files to be sure that redacted information cannot be found visually or with a program’s text search function.
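For plain-text downloads such as .txt or .csv files, the text search can be scripted. The sketch below is only an illustration: find_redacted_terms is a hypothetical helper, the terms you pass in are your own, and it does not read PDF or Word files, whose text must be checked in the program that opens them.

```python
from pathlib import Path


def find_redacted_terms(path, terms):
    """Return the terms that still appear in a plain-text file, ignoring case."""
    text = Path(path).read_text(encoding="utf-8", errors="replace").lower()
    return [term for term in terms if term.lower() in text]
```

An empty result means none of the listed terms were found in that file; any term returned indicates information that should have been redacted before upload.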

If anything did not work properly, now is the time to revise your plan and organization strategy and try the pilot ingest again.

If everything works as expected, congratulations! You are ready to begin uploading your data into tDAR, and will have saved yourself a lot of angst and frustration by preparing properly.