Index of /chromosomal_feature

Icon  Name                           Last modified      Size  Description
[DIR] Parent Directory                                     -
[TXT] SGD_CDS_xref.txt               03-Feb-2010 15:26  649K
[TXT] SGD_features.tab               06-Feb-2010 06:36  2.6M
[   ] Tachibana2005.gff              16-Oct-2006 16:21   22K
[TXT] annotation_change.tab          06-Feb-2010 07:58  279K
[DIR] archive/                       06-Feb-2010 08:14     -
[   ] chromosome_length.tab          06-Feb-2010 08:02   334
[   ] clone.tab                      06-Feb-2010 07:58   32K
[   ] dbxref.tab                     06-Feb-2010 06:37  2.8M
[   ] genetic_map.tab                06-Feb-2010 07:58  272K
[TXT] saccharomyces_cerevisiae.gff   09-Feb-2010 19:52   18M
[TXT] scerevisiae_clonedata.gff      09-Feb-2010 19:52  196K
[TXT] scerevisiae_regulatory.gff     09-Feb-2010 19:52  554K
[   ] scerevisiae_sage.gff           15-Feb-2006 11:28  2.5M

This directory contains information about S. cerevisiae chromosomal
features annotated in SGD, with files including chromosomal
coordinates, ORF name to ID mappings, annotation changes, etc.

For the Oracle database schema and specifications for SGD, see:

  http://db.yeastgenome.org/schema/SgdSchema.html

Note that these files do not correspond exactly to the database tables
described above so that we can provide more complete information
within individual files.

The archive/ subdirectory contains previous versions of these files.


==============================================================================

SGD_features.tab

This file replaces the previous chromosomal_feature.tab file and is
updated weekly (Saturday).
Highlights of the changes include:

1. It contains information on current chromosomal features in SGD,
including information about Dubious ORFs. It also contains the
coordinates of introns, exons, and other subfeatures that are located
within a chromosomal feature.

2. The relationship between subfeatures and the feature in which they
are located is identified by the feature name in column #7 (parent
feature). For example, the parent feature of the intron found in
ACT1/YFL039C will be YFL039C. The parent feature of YFL039C is
chromosome 6.

3. The coordinates of all features are in chromosomal coordinates.

4. Several feature types have been renamed to be more consistent with
GenBank files and other model organism databases: ORF is now gene,
and exon is now CDS.

Columns within SGD_features.tab:

1.   Primary SGDID (mandatory)
2.   Feature type (mandatory)
3.   Feature qualifier (optional)
4.   Feature name (optional)
5.   Standard gene name (optional)
6.   Alias (optional, multiples separated by |)
7.   Parent feature name (optional)
8.   Secondary SGDID (optional, multiples separated by |)
9.   Chromosome (optional)
10.  Start_coordinate (optional)
11.  Stop_coordinate (optional)
12.  Strand (optional)
13.  Genetic position (optional)
14.  Coordinate version (optional)
15.  Sequence version (optional)
16.  Description (optional)

The SGD_features.tab file is complemented by the GFF3 file
saccharomyces_cerevisiae.gff, described below.
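
The 16-column layout and the parent/subfeature relationship described
above can be read with a short script. The following is a minimal sketch,
assuming the file is tab-delimited with no header row; the field names
and the helper functions parse_features and children_by_parent are
illustrative labels, not part of SGD.

```python
from collections import defaultdict, namedtuple

# Illustrative names for the 16 documented columns (not official SGD names).
FIELDS = [
    "sgdid", "feature_type", "qualifier", "feature_name", "gene_name",
    "aliases", "parent", "secondary_sgdids", "chromosome", "start",
    "stop", "strand", "genetic_position", "coord_version",
    "seq_version", "description",
]
Feature = namedtuple("Feature", FIELDS)


def parse_features(lines):
    """Yield one Feature per non-empty line of SGD_features.tab."""
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue
        cols = line.split("\t")
        # Pad short rows so missing trailing optional columns become "".
        cols += [""] * (len(FIELDS) - len(cols))
        rec = dict(zip(FIELDS, cols))
        # Columns 6 and 8 hold multiple values separated by "|".
        for key in ("aliases", "secondary_sgdids"):
            rec[key] = rec[key].split("|") if rec[key] else []
        # Coordinates are optional, so empty values become None.
        for key in ("start", "stop"):
            rec[key] = int(rec[key]) if rec[key] else None
        yield Feature(**rec)


def children_by_parent(features):
    """Group features by the parent feature name in column 7."""
    index = defaultdict(list)
    for feature in features:
        if feature.parent:
            index[feature.parent].append(feature)
    return index
```

Typical use would be to pass an open file handle to parse_features and
then look up children_by_parent(...)["YFL039C"] to list the subfeatures
(such as the intron) located within that feature.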

==============================================================================

annotation_change.tab :

Contains information about annotation changes to the chromosomal
features.  This file lists features that have been either removed from
SGD or merged into another feature.  In the case of merged
features, the merged feature information is in the first 8 columns of
the file, and the information about the feature currently in the
dataset is in columns 9 through 12.  

This file also lists all changes that are made to the FeatureType of a
chromosomal feature.  Therefore, an ORF that has been made Dubious will be
listed in the file.

The Date column is the date that the annotation change occurred. This 
file is updated weekly (Saturday). 

The columns are:

1) Merged or Deleted Feature (mandatory)
2) FeatureType (mandatory, multiples separated by |)
3) Chromosome (mandatory)
4) StartCoord (optional)
5) StopCoord (optional)
6) Strand (mandatory)
7) SGDID (optional)
8) SecondarySGDID (optional, multiples separated by |)
9) CurrentFeature (optional)
10) SGDID (optional)
11) Description (optional)
12) Note (optional, multiples separated by |)
13) Date (mandatory)
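
As an illustration of the column layout above, the following sketch
reads the file and collects the merged features. It assumes the file is
tab-delimited with no header row, and it treats a non-empty column 9
(CurrentFeature) as the sign of a merge; that reading, the dictionary
keys, and the function names are assumptions made for this example.

```python
import csv

# Illustrative keys for the 13 documented columns.
FIELDS = [
    "old_feature", "feature_type", "chromosome", "start", "stop",
    "strand", "sgdid", "secondary_sgdid", "current_feature",
    "current_sgdid", "description", "note", "date",
]
MULTI_VALUED = ("feature_type", "secondary_sgdid", "note")


def read_annotation_changes(lines):
    """Yield one dict per row of annotation_change.tab."""
    # QUOTE_NONE keeps quotation marks in free-text columns literal.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row:
            continue
        row = row + [""] * (len(FIELDS) - len(row))
        rec = dict(zip(FIELDS, row))
        for key in MULTI_VALUED:
            rec[key] = rec[key].split("|") if rec[key] else []
        yield rec


def merged_into(changes):
    """Map each merged feature to the current feature named in column 9."""
    return {
        c["old_feature"]: c["current_feature"]
        for c in changes
        if c["current_feature"]
    }
```

The Date column is left as a string here because its exact format is not
specified in this document.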

==============================================================================

clone.tab : 

Contains information about yeast clones from Washington University in
St. Louis and the ATCC. It is updated weekly (Saturday), though the
underlying data are rarely altered/updated. The columns are:

1) ATCC clone name (optional)
2) Washington University clone name (optional)
3) Chromosome (mandatory)
4) Start coordinate (mandatory)
5) Stop coordinate (mandatory)
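
A common use of these five columns is to find the clones that cover a
region of interest. The sketch below is one way to do that, assuming a
tab-delimited file with no header row and inclusive coordinates; the
function names are hypothetical, and start/stop are normalized in case a
row lists them in descending order.

```python
def parse_clones(lines):
    """Yield one dict per row of clone.tab."""
    for line in lines:
        cols = line.rstrip("\r\n").split("\t")
        if len(cols) < 5:
            continue
        # Store coordinates in ascending order regardless of file order.
        start, stop = sorted((int(cols[3]), int(cols[4])))
        yield {
            "atcc": cols[0] or None,    # column 1 is optional
            "washu": cols[1] or None,   # column 2 is optional
            "chromosome": cols[2],
            "start": start,
            "stop": stop,
        }


def clones_overlapping(clones, chromosome, start, stop):
    """Return clones on `chromosome` that overlap the given interval."""
    lo, hi = sorted((start, stop))
    return [
        c for c in clones
        if c["chromosome"] == chromosome
        and c["start"] <= hi
        and c["stop"] >= lo
    ]
```

The chromosome is compared as a string, so it should be passed in the
same form it has in the file.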

scerevisiae_clonedata.gff:

Contains the above information in the Generic Feature Format Version 3
(http://song.sourceforge.net/gff3.shtml).

==============================================================================

The following two files map various other IDs to ORF names.  The SGDID
is the recommended identifier for features in SGD.

==============================================================================

dbxref.tab :

Maps ORF names and SGDIDs to other IDs, including SwissProt, EC, etc.  Currently, 
NCBI GI numbers are not included but NCBI DNA, protein, and RefSeq accession IDs 
are included.  Please see below for more details.

This file contains all ORFs.  Updated weekly (Saturday). 

Columns are:

1) DBXREF ID
2) DBXREF ID source
3) DBXREF ID type
4) S. cerevisiae feature name
5) SGDID

The IDs currently represented in this file are described below.

DBXREF ID source: Candida DB
DBXREF ID type: Gene ID
Description: Gene ID of Candida albicans ortholog of the S. cerevisiae gene, 
from CandidaDB at the Institut Pasteur.
Corresponding URL: http://genolist.pasteur.fr/CandidaDB/

DBXREF ID source: DIP
DBXREF ID type: Gene ID
Description: ID of the S. cerevisiae ORF used by the Database of Interacting 
Proteins (DIP)
Corresponding URL: http://dip.doe-mbi.ucla.edu/dip/

DBXREF ID source: EUROSCARF
DBXREF ID type: Gene ID
Description: S. cerevisiae ORF name used at the European Saccharomyces cerevisiae 
Archive for Functional Analysis (EUROSCARF), source for yeast deletion strains in Europe
Corresponding URL: http://www.uni-frankfurt.de/fb15/mikro/euroscarf/

DBXREF ID source: GermOnline
DBXREF ID type: Gene ID
Description: ID of the S. cerevisiae ORF at GermOnline, a database of germ
cell growth and gametogenesis.
Corresponding URL: http://germonline.yeastgenome.org/

DBXREF ID source: IUBMB
DBXREF ID type: EC number
Description: EC number of the reaction catalyzed by the S. cerevisiae protein.  EC
numbers are assigned to reactions by the International Union of Biochemistry and
Molecular Biology.
Corresponding URL: http://www.expasy.ch/cgi-bin/nicezyme.pl?

DBXREF ID source: MetaCyc
DBXREF ID type: Pathway ID
Description: ID of the pathway in which the S. cerevisiae protein participates.
Corresponding URL: http://pathway.yeastgenome.org:8555/server.html

DBXREF ID source: NCBI
DBXREF ID type: DNA version ID
Description: ID representing an NCBI (DDBJ/EMBL-bank/GenBank) accession number 
for the DNA sequence of an S. cerevisiae chromosomal feature.
Corresponding URL: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Nucleotide

DBXREF ID source: NCBI
DBXREF ID type: Gene ID
Description: ID of the S. cerevisiae ORF at the NCBI Entrez Gene database.
Corresponding URL: http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?db=gene

DBXREF ID source: NCBI
DBXREF ID type: Protein version ID
Description: ID representing an NCBI (DDBJ/EMBL-bank/GenBank) accession 
number for the protein sequence of an S. cerevisiae ORF.
Corresponding URL: http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?db=Protein

DBXREF ID source: NCBI
DBXREF ID type: RefSeq Accession
Description: ID of the S. cerevisiae chromosome sequence in the NCBI RefSeq 
database.
Corresponding URL: http://www.ncbi.nlm.nih.gov/RefSeq/

DBXREF ID source: NCBI
DBXREF ID type: RefSeq protein version ID
Description: ID of the S. cerevisiae protein sequence in the NCBI RefSeq 
database.
Corresponding URL: http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?db=Protein

DBXREF ID source: SIB
DBXREF ID type: Swiss-Prot ID
Description: ID representing the accession number at the Swiss-Prot protein database
for an S. cerevisiae protein.
Corresponding URL: http://us.expasy.org/sprot/
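
Because each row pairs one external ID with one SGD feature, the file is
convenient to index by source and ID type. The following is a minimal
sketch, assuming a tab-delimited file with no header row; index_dbxrefs
is a hypothetical helper, and the source and type strings must match the
values listed above exactly.

```python
from collections import defaultdict


def index_dbxrefs(lines):
    """Build {(source, id_type): {sgdid: [xref_id, ...]}} from dbxref.tab."""
    index = defaultdict(lambda: defaultdict(list))
    for line in lines:
        cols = line.rstrip("\r\n").split("\t")
        if len(cols) < 5:
            continue
        xref_id, source, id_type, feature_name, sgdid = cols[:5]
        # Key on SGDID, the recommended identifier for SGD features.
        index[(source, id_type)][sgdid].append(xref_id)
    return index
```

For example, index_dbxrefs(handle)[("SIB", "Swiss-Prot ID")] would give
the Swiss-Prot accessions recorded for each SGDID.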



==============================================================================

SGD_CDS_xref.txt :

This space-delimited file contains SGD xref update information for CDS
entries in GenBank/EMBL/DDBJ and is updated every two months.

Columns are:

1) Accession Number
2) PROTEIN_ID of CDS
3) SGDID
4) Standard S. cerevisiae ORF Name
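
Unlike the .tab files, this file is split on spaces rather than tabs. A
minimal sketch of reading it follows, assuming no field contains an
embedded space and that runs of spaces count as one separator;
protein_to_sgdid is a hypothetical helper name.

```python
def protein_to_sgdid(lines):
    """Map each PROTEIN_ID to (SGDID, ORF name, accession number)."""
    mapping = {}
    for line in lines:
        # str.split() with no argument splits on any run of whitespace.
        fields = line.split()
        if len(fields) < 4:
            continue
        accession, protein_id, sgdid, orf_name = fields[:4]
        mapping[protein_id] = (sgdid, orf_name, accession)
    return mapping
```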

==============================================================================

genetic_map.tab :

Contains genetic mapping data submitted to SGD; this file is updated
weekly (Saturday), though additions/changes to these data are rare.

Columns are:

1) two point experiment name
2) parental ditype
3) nonparental ditype
4) tetratype
5) first division
6) second division
7) distance
8) standard error
9) interference
10) interference standard error
11) note
12) gene1
13) gene1 ORF name
14) gene1 chromosome
15) gene1 genetic position
16) gene1 sgdid
17) gene2
18) gene2 ORF name
19) gene2 chromosome
20) gene2 genetic position
21) gene2 sgdid
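
Columns 2 through 4 hold the tetrad class counts from which a two-point
map distance is conventionally estimated. The sketch below applies the
standard Perkins formula, cM = 100 * (TT + 6*NPD) / (2 * (PD + NPD + TT)).
Whether the distance in column 7 was computed this way is an assumption,
as are the tab delimiter and the function names.

```python
def tetrad_counts(line):
    """Return (PD, NPD, TT) from columns 2-4 of a genetic_map.tab row."""
    cols = line.rstrip("\r\n").split("\t")
    cols += [""] * (4 - len(cols))
    # Empty cells are treated as zero counts.
    return tuple(int(c) if c.strip() else 0 for c in cols[1:4])


def perkins_distance(pd, npd, tt):
    """Estimate map distance in centimorgans with the Perkins formula."""
    total = pd + npd + tt
    if total == 0:
        raise ValueError("no tetrads scored")
    return 100.0 * (tt + 6 * npd) / (2 * total)
```

For instance, 70 parental ditype, 2 nonparental ditype, and 28 tetratype
tetrads give 100 * (28 + 12) / 200 = 20 cM.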

==============================================================================

chromosome_length.tab :

Columns are:

1) chromosome
2) NCBI RefSeq accession number
3) length

Note that chromosome 17 is the mitochondrial chromosome.
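
Since chromosome 17 denotes the mitochondrial chromosome, it usually has
to be separated out when summing lengths. A minimal sketch, assuming a
tab-delimited file with no header row and numeric chromosome labels in
column 1; the function names are illustrative.

```python
MITOCHONDRIAL = "17"  # per the note above


def read_lengths(lines):
    """Return {chromosome: (refseq_accession, length)}."""
    lengths = {}
    for line in lines:
        cols = line.rstrip("\r\n").split("\t")
        if len(cols) < 3:
            continue
        chromosome, accession, length = cols[:3]
        lengths[chromosome] = (accession, int(length))
    return lengths


def nuclear_genome_length(lengths):
    """Sum the lengths of all chromosomes except the mitochondrial one."""
    return sum(
        length
        for chromosome, (_, length) in lengths.items()
        if chromosome != MITOCHONDRIAL
    )
```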

==============================================================================

saccharomyces_cerevisiae.gff :

This file contains sequence features of Saccharomyces cerevisiae and
related information such as GO annotation. It is fully compatible with
Generic Feature Format Version 3
(http://song.sourceforge.net/gff3.shtml). It is updated nightly.

This is a standard format used by many groups.  It is used by SGD to
load the GBROWSE resource.
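
GFF3 feature lines have nine tab-separated columns (seqid, source, type,
start, end, score, strand, phase, attributes), with attributes written
as semicolon-separated key=value pairs and URL-style escapes. The sketch
below reads them under those rules; parse_gff3 is a hypothetical helper,
and it stops at a ##FASTA directive on the assumption that any embedded
sequence section is not needed.

```python
from urllib.parse import unquote


def parse_gff3(lines):
    """Yield one dict per feature line of a GFF3 file."""
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("##FASTA"):
            break  # the remainder is sequence data, not features
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments, and directives
        cols = line.split("\t")
        if len(cols) != 9:
            continue
        seqid, source, ftype, start, end, score, strand, phase, attrs = cols
        attributes = {}
        for pair in attrs.split(";"):
            if "=" not in pair:
                continue
            key, value = pair.split("=", 1)
            # Multiple values for one key are comma-separated.
            attributes[unquote(key)] = [unquote(v) for v in value.split(",")]
        yield {
            "seqid": unquote(seqid),
            "source": source,
            "type": ftype,
            "start": int(start),
            "end": int(end),
            "score": None if score == "." else float(score),
            "strand": strand,
            "phase": None if phase == "." else int(phase),
            "attributes": attributes,
        }
```

The same reader should apply to the other .gff files in this directory,
since they are described as using the same format.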

NOTE: A resgen_primers.gff file containing primer sequences and sequence 
coordinates is available from the following ftp directory:

ftp://ftp.yeastgenome.org/yeast/sequence/primer_sequences/


==============================================================================
scerevisiae_regulatory.gff:

This file contains transcription factor binding site data published by
Harbison, et al. (2004) Transcriptional regulatory code of a eukaryotic
genome. Nature 431(7004):99-104 [PMID:15343339].  It is fully
compatible with the Generic Feature Format Version 3
(http://song.sourceforge.net/gff3.shtml).

This is a standard format used by many groups.  These sequence
features can also be viewed in the GBROWSE genome browser at SGD.

==============================================================================
scerevisiae_sage.gff

This file contains the SAGE data from the S. cerevisiae SAGE study published by
Velculescu et al. (1997) "Characterization of the yeast transcriptome" Cell 88:243-251.  
It is fully compatible with the Generic Feature Format Version 3 
(http://song.sourceforge.net/gff3.shtml).

This is a standard format used by many groups.  These sequence
features can also be viewed in the GBROWSE genome browser at SGD.

==============================================================================
Tachibana2005.gff

This file contains transcription factor binding site data published by
Tachibana, et al. (2005) Combined global localization analysis and
transcriptome data identify genes that are directly coregulated by
Adr1 and Cat8. Mol Cell Biol 25(6):2138-46 [PMID:15743812].  It is
fully compatible with the Generic Feature Format Version 3
(http://song.sourceforge.net/gff3.shtml).

This is a standard format used by many groups.  These sequence
features can also be viewed in the GBROWSE genome browser at SGD.

==============================================================================

******************************************************************************
******************************************************************************
*****  T H E   F O L L O W I N G   F I L E S   A R E   O B S O L E T E  ******
******************************************************************************
******************************************************************************
==============================================================================

external_id.tab : 

This file is now obsolete and has been replaced with dbxref.tab,
see above.  Last updated August 28, 2004.

September 2004: This file has been moved to the directory: data_download/obsolete/

==============================================================================

chromosomal_feature.tab : 

This file is now obsolete and has been replaced with SGD_features.tab,
see above.  Last updated August 28, 2004.

September 2004: This file has been moved to the directory: data_download/obsolete/

==============================================================================

chromosomal_feature.last_week :

This file was used by internal scripts and was just a copy of the
chromosomal_feature.tab file.  This file is no longer provided.

==============================================================================

chromosomal_feature.previous_format :

This file is the last version of the previous format of the
chromosomal_feature.tab file.  This format only contained the first 12
columns described above.  This file was last updated on July 15, 2002.

November 2003: This file was moved to the directory: data_download/obsolete/

==============================================================================

intron_exon.tab :

September 2004: This file has been moved to the directory: data_download/obsolete/

==============================================================================