1 | packageIdPattern | valid |
Type: | metadata |
System: | lter |
On failure: | error |
| packageId pattern matches "scope.identifier.revision" | Check against LTER requirements for scope.identifier.revision | 'scope.n.m', where 'n' and 'm' are integers and 'scope' is one of an allowed set of values | edi.0.3 | | | |
2 | emlVersion | valid |
Type: | metadata |
System: | lter |
On failure: | error |
| EML version 2.1.0 or beyond | Check the EML document declaration for version 2.1.0 or higher | eml://ecoinformatics.org/eml-2.1.0 or eml://ecoinformatics.org/eml-2.1.1 | eml://ecoinformatics.org/eml-2.1.1 | Validity of this quality report is dependent on this check being valid. | | |
3 | schemaValid | valid |
Type: | metadata |
System: | knb |
On failure: | error |
| Document is schema-valid EML | Check document schema validity | schema-valid | Document validated for namespace: 'eml://ecoinformatics.org/eml-2.1.1' | Validity of this quality report is dependent on this check being valid. | | |
4 | parserValid | valid |
Type: | metadata |
System: | knb |
On failure: | error |
| Document is EML parser-valid | Check document using the EML IDs and references parser | Validates with the EML IDs and references parser | EML IDs and references parser succeeded | Validity of this quality report is dependent on this check being valid. | | |
5 | schemaValidDereferenced | valid |
Type: | metadata |
System: | lter |
On failure: | error |
| Dereferenced document is schema-valid EML | References are dereferenced, and the resulting file validated | schema-valid | Dereferenced document validated for namespace: 'eml://ecoinformatics.org/eml-2.1.1' | Validity of this quality report is dependent on this check being valid. | | |
6 | keywordPresent | warn |
Type: | metadata |
System: | lter |
On failure: | warn |
| keyword element is present | Checks to see if at least one keyword is present | Presence of one or more keyword elements | 0 'keyword' element(s) found | The LTER portal allows searches on keywords. This check is a precursor for checking on keywords from the controlled vocabulary. | Add at least one keyword. | |
7 | methodsElementPresent | valid |
Type: | metadata |
System: | lter |
On failure: | warn |
| A 'methods' element is present | All datasets should contain a 'methods' element, at a minimum a link to a separate methods doc. | presence of 'methods' at one or more xpaths. | 2 'methods' element(s) found | | | EML Best Practices, p. 28 |
8 | coveragePresent | warn |
Type: | metadata |
System: | lter |
On failure: | warn |
| coverage element is present | At least one coverage element should be present in a dataset. | At least one of geographicCoverage, taxonomicCoverage, or temporalCoverage is present in the EML. | 0 'coverage' element(s) found | | | |
9 | geographicCoveragePresent | info |
Type: | metadata |
System: | lter |
On failure: | info |
| geographicCoverage is present | Check that geographicCoverage exists in EML at the dataset level, or at least one entity's level, or at least one attribute's level. | geographicCoverage at least at the dataset level. | 0 'geographicCoverage' element(s) found | Many but not all datasets are appropriate to have spatial coverage. | If sampling EML is used within methods, does that obviate geographicCoverage? Or should those sites be repeated or referenced? | EML Best Practices v.2, p. 22-23. "One geographicCoverage element should be included, whose boundingCoordinates describe the extent of the data....Additional geographicCoverage elements may be entered at the dataset level if there are significant distances between study sites and it would be confusing if they were grouped into one bounding box." 6 decimal places. |
10 | taxonomicCoveragePresent | info |
Type: | metadata |
System: | lter |
On failure: | info |
| taxonomicCoverage is present | Check that taxonomicCoverage exists in EML at the dataset level, or at least one entity's level, or at least one attribute's level. | taxonomicCoverage at least at the dataset level. | 0 'taxonomicCoverage' element(s) found | Only when taxa are pertinent to the dataset will they have taxonomicCoverage. | Could search title, abstract, keywords for any taxonomic name (huge). Could search keywordType="taxonomic". | EML Best Practices v.2, p. 25 |
11 | temporalCoveragePresent | info |
Type: | metadata |
System: | lter |
On failure: | info |
| temporalCoverage is present | Check that temporalCoverage exists in EML at the dataset level, or at least one entity's level, or at least one attribute's level. | temporalCoverage at least at the dataset level. | 0 'temporalCoverage' element(s) found | LTER wants to search datasets by time; the best place to search is the dataset level temporal coverage. | Most datasets have a temporal range. | EML Best Practices v.2, p. 24 |
12 | pastaDoiAbsent | valid |
Type: | metadata |
System: | lter |
On failure: | error |
| An alternateIdentifier with a DOI system attribute that looks like it is generated by PASTA should not be present | Reject the data package if it contains an alternateIdentifier DOI that looks like PASTA generated it. | No PASTA DOIs are expected to be found in the uploaded data package | No PASTA DOI alternateIdentifier elements found | PASTA DOI values might appear in an uploaded data package (by various mechanisms). PASTA will assign a DOI after the upload has completed successfully, so an initial one should not be there. | | |
13 | titleLength | valid |
Type: | metadata |
System: | lter |
On failure: | warn |
| Dataset title length is at least 5 words. | If the title is shorter than 5 words, it might be insufficient. Title word count should be between 7 and 20, including prepositions and numbers. | Between 7 and 20 words | 5 words found. | | | EML Best Practices, v.2, p. 13 |
14 | pubDatePresent | warn |
Type: | metadata |
System: | lter |
On failure: | warn |
| 'pubDate' element is present | Check for presence of the pubDate element | The date that the dataset was submitted for publication in PASTA must be included. (The EML schema does not require this element, but when present, it does constrain its format to YYYY-MM-DD or just YYYY. Citation format uses only the YYYY portion even if a full date is entered.) | pubDate not found | 'pubDate is part of citation'. 'pubDate' qualifies use of "ongoing" in other metadata elements. | The year of public release of data online should be listed as the 'pubDate' element. The 'pubDate' should be updated when data and/or metadata are updated or re-released. The format can be either a 4-digit year (YYYY), or an ISO date (YYYY-MM-DD). | EML Best Practices v.2, p. 17 |
15 | datasetAbstractLength | warn |
Type: | metadata |
System: | lter |
On failure: | warn |
| Dataset abstract element is a minimum of 20 words | Check the length of a dataset abstract and warn if less than 20 words. | An abstract is 20 words or more. | 0 words found. | An abstract helps a user determine if the dataset is useful for a specific purpose. An abstract is usually a paragraph. | Add an abstract. | EML Best Practices |
16 | duplicateEntityName | valid |
Type: | metadata |
System: | lter |
On failure: | error |
| There are no duplicate entity names | Checks that content is not duplicated by other entityName elements in the document | entityName is not a duplicate within the document | No duplicates found | Data Manager requires a non-empty, non-duplicate entityName value for every entity | Declare a non-empty entityName and ensure that there are no duplicate entityName values in the document | |
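The packageIdPattern check (dataset-level check 1) can be illustrated with a short sketch. This is not PASTA's implementation: the function name `check_package_id` and the `ALLOWED_SCOPES` set are hypothetical, and the report does not list the allowed scopes beyond showing that `edi` was accepted.

```python
import re

# Hypothetical set of allowed scopes; the report only shows "edi" being accepted.
ALLOWED_SCOPES = {"edi"}

# 'scope.n.m', where n (identifier) and m (revision) are integers.
PACKAGE_ID_RE = re.compile(r"([A-Za-z0-9_-]+)\.([0-9]+)\.([0-9]+)")


def check_package_id(package_id, allowed_scopes=ALLOWED_SCOPES):
    """Return 'valid' if package_id looks like scope.identifier.revision, else 'error'."""
    match = PACKAGE_ID_RE.fullmatch(package_id)
    if match is None:
        return "error"
    scope = match.group(1)
    return "valid" if scope in allowed_scopes else "error"
```

Under these assumptions, `check_package_id("edi.0.3")` returns `"valid"`, while a missing revision (`"edi.0"`) or an unlisted scope returns `"error"`, matching the "On failure: error" setting of the check.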
1 | entityNameLength | valid |
Type: | metadata |
System: | knb |
On failure: | warn |
| Length of entityName is not excessive (less than 100 char) | length of entity name is less than 100 characters | entityName value is 100 characters or less | 16 | | | |
2 | entityDescriptionPresent | valid |
Type: | metadata |
System: | lter |
On failure: | warn |
| An entity description is present | Check for presence of an entity description. | EML Best practices pp. 32-33, "...should have enough information for a user..." | true | With entityName sometimes serving as a file name rather than a title, it is important to be very descriptive here. | | |
3 | numHeaderLinesPresent | info |
Type: | metadata |
System: | knb |
On failure: | info |
| 'numHeaderLines' element is present | Check for presence of the 'numHeaderLines' element. | Document contains 'numHeaderLines' element. | 'numHeaderLines' element: 0 | | | |
4 | numFooterLinesPresent | info |
Type: | metadata |
System: | knb |
On failure: | info |
| 'numFooterLines' element is present | Check for presence of the 'numFooterLines' element. | Document contains 'numFooterLines' element. | No 'numFooterLines' element found | If data file contains footer lines, 'numFooterLines' must be specified. | Add 'numFooterLines' element if needed. | |
5 | fieldDelimiterValid | valid |
Type: | metadata |
System: | knb |
On failure: | error |
| Field delimiter is a single character | Field delimiters should be one character only | A single character is expected | , | A valid fieldDelimiter value was found | | http://knb.ecoinformatics.org/software/eml/eml-2.1.0/eml-physical.html#fieldDelimiter |
6 | recordDelimiterPresent | valid |
Type: | metadata |
System: | knb |
On failure: | warn |
| Record delimiter is present | Check presence of record delimiter. Check that the record delimiter is one of the suggested values. | A record delimiter from a list of suggested values: \n, \r, \r\n, #x0A, #x0D, #x0D#x0A | \n | A valid recordDelimiter value was found | | http://knb.ecoinformatics.org/software/eml/eml-2.1.0/eml-physical.html#recordDelimiter |
7 | numberOfRecordsPresence | valid |
Type: | metadata |
System: | knb |
On failure: | warn |
| Is the numberOfRecords element present? | Warn the user if the numberOfRecords element is not present | A number of records element is expected for this entity | numberOfRecords element found | This is a valuable check that we have the correct table. | | |
8 | integrityChecksumPresence | valid |
Type: | metadata |
System: | lter |
On failure: | warn |
| A physical/authentication element is present and specifies a method attribute with a value of MD5 or SHA-1 | Check for presence of a physical/authentication element containing a checksum or hash value for an integrity check (e.g. MD5, SHA-1). Warn if an entity does not have a physical/authentication element, or if none of the physical/authentication elements specify a method attribute with a value of MD5 or SHA-1. | At least one physical/authentication element with a method attribute specifying MD5 or SHA-1 and containing a checksum or hash value that can be used to perform an integrity check on the data. | true | PASTA will use this value to check the integrity of the data it downloads from your site. In addition, PASTA is planning to compare the contributor-supplied checksum/hash value documented in the physical/authentication element to the checksum/hash value of this entity downloaded from previous revisions of this data package. If PASTA already has a copy of this entity, it will be able to avoid an unnecessary download of the entity from your site, resulting in faster processing of the new data package revision when you update it in PASTA. | Add a physical/authentication element and store the entity checksum or hash value in it using a method such as MD5 or SHA-1. | |
9 | dateTimeFormatString | valid |
Type: | metadata |
System: | knb |
On failure: | warn |
| dateTime/formatString specified in metadata is from a preferred set of values | Certain features of dateTime data formats are preferred, e.g., ISO 8601, 4-digit years. This check looks at metadata to see if the dateTime format is in that preferred list. | A formatString value that is a member of the preferred set is expected. | YYYY | A preferred format string was found. | Modify the dateTime/formatString, selecting from among the preferred values one that best matches the data format | |
10 | attributeNamesUnique | valid |
Type: | metadata |
System: | knb |
On failure: | warn |
| Attribute names are unique | Checks if attributeName values are unique in the table. Not required by EML. | Unique attribute names. | true | A good table does not have duplicate column names. | | EML Best Practices |
11 | integrityChecksum | valid |
Type: | congruency |
System: | lter |
On failure: | error |
| Compare the metadata checksum for an entity to the checksum of the downloaded entity | Two possible responses: valid if checksums match; error if checksums do not match. | 915a52bd06ef5730ca5ef33dd359380a59c86ef5 | 915a52bd06ef5730ca5ef33dd359380a59c86ef5 | Matching checksums will ensure data integrity during upload to the repository. | If the found integrity hash value does not match the expected integrity hash value, there may have been a loss of integrity in the data download. Check that the hash method and hash value documented in the metadata are the correct values. | |
12 | databaseTableCreated | valid |
Type: | metadata |
System: | knb |
On failure: | error |
| Database table created | Status of creating a database table | A database table is expected to be generated from the EML attributes. | A database table was generated from the attributes description | CREATE TABLE NoneSuchBugCount("fld" TEXT,"year" TIMESTAMP,"sppm2" FLOAT); | | |
13 | displayDownloadData | info |
Type: | data |
System: | knb |
On failure: | info |
| Display downloaded data | Display the first kilobyte of data that is downloaded | Up to one kilobyte of data should be displayed | *** BINARY DATA *** | | | |
14 | urlReturnsData | valid |
Type: | congruency |
System: | knb |
On failure: | error |
| URL returns data | Checks whether a URL returns data. Unless the URL is specified to be function="information", the URL should return the resource for download. | A data entity that matches the metadata | true | | | http://knb.ecoinformatics.org/software/eml/eml-2.1.0/eml-resource.html#UrlType |
15 | onlineURLs | valid |
Type: | congruency |
System: | knb |
On failure: | error |
| Online URLs are live | Check that online URLs return something | true | true | Succeeded in accessing URL: file:///home/pasta/local/data/edi.0.3/a9201a0755fc45ae514abb12469c03a0 | | |
16 | examineRecordDelimiter | valid |
Type: | congruency |
System: | knb |
On failure: | warn |
| Data are examined and possible record delimiters are displayed | If no record delimiter was specified, we assume that \r\n is the delimiter. Search the first row for other record delimiters and see if other delimiters are found. | No other potential record delimiters expected in the first row. | No other potential record delimiters were detected. A valid record delimiter was previously detected | | | http://knb.ecoinformatics.org/software/eml/eml-2.1.0/eml-physical.html#recordDelimiter |
17 | displayFirstInsertRow | info |
Type: | data |
System: | knb |
On failure: | info |
| Display first insert row | Display the first row of data values to be inserted into the database table | The first row of data values should be displayed | Blue Field, 1998, 4.5 | | | |
18 | tooFewFields | valid |
Type: | congruency |
System: | knb |
On failure: | error |
| Data does not have fewer fields than metadata attributes | Compare number of fields specified in metadata to number of fields found in a data record | | No errors detected | | | http://knb.ecoinformatics.org/software/eml/eml-2.1.0/eml-physical.html#dataFormat |
19 | tooManyFields | valid |
Type: | congruency |
System: | knb |
On failure: | error |
| Data does not have more fields than metadata attributes | Compare number of fields specified in metadata to number of fields found in a data record | | No errors detected | | | http://knb.ecoinformatics.org/software/eml/eml-2.1.0/eml-physical.html#dataFormat |
20 | dataLoadStatus | valid |
Type: | congruency |
System: | knb |
On failure: | warn |
| Data can be loaded into the database | Status of loading the data table into a database | No errors expected during data loading or data loading was not attempted for this data entity | The data table loaded successfully into a database | | | |
21 | numberOfRecords | valid |
Type: | congruency |
System: | knb |
On failure: | warn |
| Number of records in metadata matches number of rows loaded | Compare number of records specified in metadata to number of records found in data | 42 | 42 | The expected number of records (42) was found in the data table. | | |
22 | dateFormatMatches | valid |
Type: | congruency |
System: | lter |
On failure: | warn |
| Date format in metadata matches data | dateTime/formatString in attribute metadata is from the preferred list, and the data matches. A non-match generates only a warn. | Format string is preferred, and all data values match the format string | Data values matched the specified formatString. | | Dates should be consistently formatted and match the formatString in metadata. | |
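The integrityChecksum comparison (entity-level check 11) amounts to hashing the downloaded entity and comparing the result with the value documented in physical/authentication. The sketch below assumes the two methods named in this report (MD5 and SHA-1); the function name `check_integrity` and its parameters are hypothetical and do not reflect PASTA's actual code.

```python
import hashlib

# Map the method names used in this report to hashlib algorithm names.
_ALGORITHMS = {"MD5": "md5", "SHA-1": "sha1"}


def check_integrity(data, expected_hex, method="SHA-1"):
    """Return 'valid' if the hash of data matches expected_hex, else 'error'."""
    algorithm = _ALGORITHMS.get(method.upper())
    if algorithm is None:
        raise ValueError(f"unsupported method: {method}")
    found_hex = hashlib.new(algorithm, data).hexdigest()
    # Hex digests are compared case-insensitively, ignoring surrounding whitespace.
    return "valid" if found_hex == expected_hex.strip().lower() else "error"
```

Here `data` would be the raw bytes of the downloaded entity and `expected_hex` the checksum from the metadata. The 40-character hexadecimal value shown in check 11 has the length of a SHA-1 digest.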