MIARE: Minimum Information About an RNAi Experiment : GuidanceNotesV080


MIARE-Tab Guidance Notes v0.8.0 – May 2011

PubChem BioAssay Description CSV Tags

1. General Description Items
Definitions of the individual items in the bioassay description.

PUBCHEM_EXT_DATASOURCE_REGID
Required. The depositor's own unique identifier for the deposited bioassay. It must be unique across all data deposited by you. If you reuse an external ID that already exists among your depositions, the submission is treated as an update request, and the existing bioassay record in PubChem is replaced with the data you provide.

PUBCHEM_ASSAY_NAME
Required. Name of the assay.

PUBCHEM_GRANT_NUMBER

PUBCHEM_PROJECT_CATEGORY
Enter RNAI_GLOBAL_INITIATIVE for projects submitted by RNAi Global Initiative members.

PUBCHEM_ACTIVITY_OUTCOME_METHOD
Accepted values for this tag: PRIMARY_SCREENING, CONFIRMATORY, SUMMARY, OTHER

PUBCHEM_SUBSTANCE_TYPE
Required. Enter 'NUCLEOTIDE' for an RNAi screen.

PUBCHEM_HOLD_UNTIL_DATE
Date the deposited bioassay is to be made available through the public PubChem database. The maximum permitted on-hold time is 12 months. This is an NIH/NLM policy. The expected format is: YYYY-MM-DD (e.g. 2009-07-16). If not provided, the deposition will be made available immediately to the public.
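The date rule above can be checked before deposition. The following is a minimal sketch, not part of any PubChem tool: the function name check_hold_until is hypothetical, and it assumes that "12 months" means the same calendar day one year after the deposition date.

```python
import re
from datetime import date, datetime


def check_hold_until(value: str, today: date) -> date:
    """Parse a YYYY-MM-DD hold date; reject dates more than 12 months ahead."""
    # Require exactly four, two and two digits, as in the example 2009-07-16.
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        raise ValueError(f"{value!r} is not in YYYY-MM-DD format")
    hold = datetime.strptime(value, "%Y-%m-%d").date()  # rejects e.g. 2009-02-30
    # Assumption: the 12-month limit is the same day one year later.
    try:
        limit = today.replace(year=today.year + 1)
    except ValueError:  # today is 29 February and next year has none
        limit = today.replace(year=today.year + 1, day=28)
    if hold > limit:
        raise ValueError(f"{value} is beyond the 12-month limit {limit.isoformat()}")
    return hold
```

Passing the deposition date as `today` keeps the check reproducible; an empty PUBCHEM_HOLD_UNTIL_DATE should simply skip this check, since the deposition is then released immediately.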

PUBCHEM_ASSAY_DESCRIPTION
Enter PubChem bioassay description as tag/value pairs in the MIARE sheet.

PUBCHEM_ASSAY_PROTOCOL
Enter PubChem protocol description as tag/value pairs in the MIARE sheet.

PUBCHEM_ASSAY_COMMENTS
Enter PubChem comments description as tag/value pairs in the MIARE sheet.


2. Result Definitions
Definitions of the column headers present on the data CSV file.

RESULT_ID
Required. A sequentially increasing integer ID starting from one.

RESULT_NAME
Required. Must exactly match the name and order of the column headers as they appear in the data CSV file after column 5, PUBCHEM_ASSAYDATA_COMMENT.

RESULT_TYPE
Required. The data type of each column header in the data CSV file. Accepted values for this tag:
FLOAT, INTEGER, BOOLEAN, STRING, PUBCHEM_NCBI_PUBMED_ID, PUBCHEM_EXT_URL, PUBCHEM_NCBI_NUCLEOTIDE_GI, PUBCHEM_NCBI_GENE_ID, PUBCHEM_NCBI_PROBE_ID, PUBCHEM_SID, TARGET_NCBI_TAXONOMY_ID, TARGET_NCBI_GENE_ID

RESULT_DESCR
Description of the data column header e.g. ‘Confirmed by deconvoluted siRNA pools’ for ‘Hit Confirmation’.

RESULT_UNIT
Required. The data units for each column header in the data CSV file. Accepted values for this tag:
PPT, PPM, PPB, MILLIMOLAR, MICROMOLAR, NANOMOLAR, PICOMOLAR, FEMTOMOLAR, MILLIGR_PER_ML, MICROGR_PER_ML, NANOGR_PER_ML, PICOGR_PER_ML, FEMTOGR_PER_ML, MOLAR, PERCENT, RATIO, SECONDS, RECIPROCAL_SECONDS, MINUTES, RECIPROCAL_MINUTES, DAYS, RECIPROCAL_DAYS, OTHER, NONE, UNSPECIFIED
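The rules for result definitions lend themselves to a simple pre-submission check. The sketch below assumes each definition has been read from the description sheet into a dictionary keyed by tag name; that representation and the function name check_result_definitions are illustrative assumptions. The accepted values are copied from the lists above.

```python
RESULT_TYPES = {
    "FLOAT", "INTEGER", "BOOLEAN", "STRING", "PUBCHEM_NCBI_PUBMED_ID",
    "PUBCHEM_EXT_URL", "PUBCHEM_NCBI_NUCLEOTIDE_GI", "PUBCHEM_NCBI_GENE_ID",
    "PUBCHEM_NCBI_PROBE_ID", "PUBCHEM_SID", "TARGET_NCBI_TAXONOMY_ID",
    "TARGET_NCBI_GENE_ID",
}
RESULT_UNITS = {
    "PPT", "PPM", "PPB", "MILLIMOLAR", "MICROMOLAR", "NANOMOLAR", "PICOMOLAR",
    "FEMTOMOLAR", "MILLIGR_PER_ML", "MICROGR_PER_ML", "NANOGR_PER_ML",
    "PICOGR_PER_ML", "FEMTOGR_PER_ML", "MOLAR", "PERCENT", "RATIO", "SECONDS",
    "RECIPROCAL_SECONDS", "MINUTES", "RECIPROCAL_MINUTES", "DAYS",
    "RECIPROCAL_DAYS", "OTHER", "NONE", "UNSPECIFIED",
}


def check_result_definitions(defs):
    """Return a list of problems found in a list of result-definition dicts."""
    problems = []
    for expected_id, d in enumerate(defs, start=1):
        # RESULT_ID must count up from one without gaps.
        if d.get("RESULT_ID") != expected_id:
            problems.append(f"definition {expected_id}: RESULT_ID should be {expected_id}")
        if not d.get("RESULT_NAME"):
            problems.append(f"definition {expected_id}: RESULT_NAME is missing")
        if d.get("RESULT_TYPE") not in RESULT_TYPES:
            problems.append(f"definition {expected_id}: RESULT_TYPE is not an accepted value")
        if d.get("RESULT_UNIT") not in RESULT_UNITS:
            problems.append(f"definition {expected_id}: RESULT_UNIT is not an accepted value")
    return problems
```

An empty list means the definitions satisfy the four required tags; the check does not compare RESULT_NAME against the data CSV headers.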


3. XRefs
Cross references to relevant information in other databases. For example, PubChem AIDs of other related BioAssays.

XREF_TYPE
Required. Accepted values for this tag (only those relevant to RNAi depositions are listed):
PUBCHEM_NCBI_TAXONOMY_ID, PUBCHEM_NCBI_PUBMED_ID, PUBCHEM_NCBI_OMIM_ID, PUBCHEM_AID

XREF_VALUE
Required.

XREF_ANNOTATION
Specific description of the tag e.g. PUBCHEM_NCBI_TAXONOMY_ID / human sequences


4. MIARE (Categorised Comments)
MIARE-specific tag/value pairs (entered under CAT_COMMENT_TAG and CAT_COMMENT_VALUE columns respectively) that are stored in the assay record as comments. All such comments are searchable in PubChem.

See the MIARE checklist for the full list of tag/value pairs.


PubChem Substance Description CSV Tags

5. Substance

Column 1: PUBCHEM_EXT_DATASOURCE_REGID
Required. The depositor's own unique identifier for Substance descriptions. It must be unique across all data deposited by you. If you reuse an external ID that already exists among your depositions, the submission is treated as an update request, and the existing substance record in PubChem is replaced with the data you provide. This is the only required field in the substance CSV file.
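Because a repeated identifier replaces an existing record rather than raising an error, it can be worth scanning a substance list for accidental repeats before deposition. This is a minimal sketch; the function name duplicate_regids is hypothetical, and it only looks within one list, not across earlier depositions.

```python
from collections import Counter


def duplicate_regids(regids):
    """Return identifiers that occur more than once, in first-seen order."""
    # Surrounding whitespace is ignored so that "A1" and "A1 " count as the same ID.
    counts = Counter(r.strip() for r in regids)
    return [r for r, n in counts.items() if n > 1]
```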

PUBCHEM_NCBI_GENE_ID
NCBI Entrez Gene ID for a specific RNAi substance.

PUBCHEM_NCBI_PROBE_ID
NCBI Entrez Probe ID for a specific RNAi substance.

PUBCHEM_SUBSTANCE_COMMENT
Textual annotations, such as comments on the source of the reagent sample or the name of the gene target of the siRNA, may optionally be provided for substance data. These annotations can be found through exact or keyword text searches.

PUBCHEM_NCBI_TAXONOMY_ID
If the list of substances is not derived from a single organism, an NCBI Taxonomy ID may optionally be provided for a specific substance to indicate the source organism.

PUBCHEM_HOLD_UNTIL_DATE
Date deposited substance data is to be made available through the public PubChem database. The maximum permitted on-hold time is 12 months. This is an NIH/NLM policy. The expected format is: YYYY-MM-DD (e.g. 2009-07-16). If not provided, the deposition will be made available immediately to the public.


PubChem Assay Data Description CSV Tags

6. Data

The CSV column ordering for the first five columns is fixed and must be exactly as documented below. Beyond that, there must be a column for each result defined in the description.

Click on the "CSV Template" link (in the Add Data View only) to download a CSV template file using the Assay Description that has been entered. This is a guide so that you can cut and paste your data into this CSV file while strictly maintaining the correct number of columns. For fields without data there will be nothing but consecutive commas. There is also an example CSV file with data. The CSV data file must either have no column headers or these automatically generated headers; any deviations will cause errors.

The following fixed columns are expected in your CSV file. Optional fields for which there is no data available should be left empty. Column headers and their order in the data file(s) should exactly match the names and order of the result definitions.

Note: Substance descriptions must be deposited in PubChem prior to depositing assay descriptions and data.

Column 1: PUBCHEM_SID
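Required. The Substance identifier (SID) is a whole number generated by PubChem after the substance list has been deposited. If substances are identified by their PubChem SID, leave PUBCHEM_EXT_DATASOURCE_REGID blank. If they are identified by PUBCHEM_EXT_DATASOURCE_REGID instead, set this column to '0' (zero) or leave it blank.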

Column 2: PUBCHEM_EXT_DATASOURCE_REGID
Required. The depositor's own unique identifier for Substance descriptions previously loaded into either PubChem or the PubChem deposition system. If you provide a value in this column, you must set the value in Column 1 to '0' (zero) or leave it blank.
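Columns 1 and 2 are alternatives: each row names its substance by one identifier or the other. The sketch below expresses that rule for a single row. The function name check_substance_identifier is hypothetical, and treating a row with neither identifier as an error is an assumption drawn from both columns being marked Required.

```python
def check_substance_identifier(sid: str, regid: str) -> str:
    """Return the name of the column that identifies the substance in a row."""
    sid, regid = sid.strip(), regid.strip()
    has_sid = sid not in ("", "0")  # '0' or blank means "no SID given"
    has_regid = regid != ""
    if has_sid and has_regid:
        raise ValueError("give either PUBCHEM_SID or PUBCHEM_EXT_DATASOURCE_REGID, not both")
    if not has_sid and not has_regid:
        raise ValueError("row has no substance identifier")  # assumption, see above
    if has_sid and not (sid.isascii() and sid.isdigit()):
        raise ValueError("PUBCHEM_SID must be a whole number")
    return "PUBCHEM_SID" if has_sid else "PUBCHEM_EXT_DATASOURCE_REGID"
```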

Column 3: PUBCHEM_ACTIVITY_OUTCOME
Required. The outcome for each Substance is represented by one of five values:

1 - Substance is considered inactive.
2 - Substance is considered active.
3 - Substance activity outcome is inconclusive.
4 - Substance activity outcome is unspecified.
5 - Substance identified as a probe (only allowed in summary assays).
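When generating the data file from a script, naming the five codes avoids writing bare numbers. This is a sketch only: the member names are paraphrased from the list above, and linking "summary assays" to a PUBCHEM_ACTIVITY_OUTCOME_METHOD value of SUMMARY is an assumption rather than something these notes state.

```python
from enum import IntEnum


class ActivityOutcome(IntEnum):
    INACTIVE = 1
    ACTIVE = 2
    INCONCLUSIVE = 3
    UNSPECIFIED = 4
    PROBE = 5


def parse_outcome(raw: str, outcome_method: str) -> ActivityOutcome:
    """Convert a CSV field to an outcome; raises ValueError if it is not 1-5."""
    outcome = ActivityOutcome(int(raw))
    # Assumption: a summary assay is one whose outcome method is SUMMARY.
    if outcome is ActivityOutcome.PROBE and outcome_method != "SUMMARY":
        raise ValueError("outcome 5 (probe) is only allowed in summary assays")
    return outcome
```

Since IntEnum members are integers, `int(ActivityOutcome.ACTIVE)` gives the value to write back into Column 3.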

Column 4: PUBCHEM_ACTIVITY_SCORE
The score for each Substance is a whole number, where larger values indicate greater activity. Scores are expected to be on a linear scale, so raw values should be transformed accordingly. Although not an absolute requirement, the range should preferably be adjusted to 0-100; larger and smaller values are allowed. The score values allow PubChem users to partition, sort, and profile Assay Data results within and between biological assays.
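One possible way to bring a set of readouts into the preferred 0-100 range is linear min-max scaling, sketched below. This is an illustration rather than a prescribed PubChem method: the function name scale_scores is hypothetical, and it assumes the input values are already on a linear scale with larger values meaning more active.

```python
def scale_scores(raw):
    """Map values linearly onto whole numbers from 0 (lowest) to 100 (highest)."""
    lo, hi = min(raw), max(raw)
    if hi == lo:
        # All values identical: no spread to scale, so score everything 0.
        return [0 for _ in raw]
    return [round(100 * (x - lo) / (hi - lo)) for x in raw]
```

If smaller readouts indicate stronger activity (for example, reduced viability), the values would need to be inverted before scaling.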

Column 5: PUBCHEM_ASSAYDATA_COMMENT
Textual annotations and comments may optionally be provided in this column for the Assay Data reported for this Substance. These comments can be found through exact or keyword text searches.

Column 6: Target Gene ID
Required. NCBI Entrez Gene ID

Column 7: Nucleotide GI
Recommended. Sequence identifier of type PUBCHEM_NCBI_NUCLEOTIDE_GI

Columns 8 and higher (one column per result definition):
All remaining columns correspond one-to-one, in the same order, to the result definitions in the associated Assay Description. Every defined column must be present, but individual fields may be left empty. Consult the CSV template file auto-generated from your description to see the layout.
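The column layout described in this section can be checked mechanically before upload. The sketch below is an assumption-laden illustration, not the PubChem validator: the function names are hypothetical, and result_names is taken to be every RESULT_NAME in definition order, that is, all headers after column 5, including the Target Gene ID and Nucleotide GI columns.

```python
import csv
import io

FIXED_COLUMNS = [
    "PUBCHEM_SID",
    "PUBCHEM_EXT_DATASOURCE_REGID",
    "PUBCHEM_ACTIVITY_OUTCOME",
    "PUBCHEM_ACTIVITY_SCORE",
    "PUBCHEM_ASSAYDATA_COMMENT",
]


def expected_header(result_names):
    """Five fixed columns followed by one column per result definition."""
    return FIXED_COLUMNS + list(result_names)


def check_data_csv(text, result_names):
    """Return a list of layout problems found in the data CSV text."""
    header = expected_header(result_names)
    rows = list(csv.reader(io.StringIO(text)))
    problems = []
    first_data = 0
    # A header row is optional; if present it must match exactly.
    if rows and rows[0][:1] == FIXED_COLUMNS[:1]:
        first_data = 1
        if rows[0] != header:
            problems.append("line 1: header does not match the result definitions")
    for n, row in enumerate(rows[first_data:], start=first_data + 1):
        # Empty fields are fine, but every column must still be present.
        if len(row) != len(header):
            problems.append(f"line {n}: expected {len(header)} columns, found {len(row)}")
    return problems
```

A row such as `,siRNA-001,2,87,,7157,,` (made-up values) has eight fields, three of them empty, and passes the check against three result definitions.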

 

