New & Noteworthy

New Sequence, Chromosome, and Contig pages

August 25, 2014

New Sequence pages are now available in SGD for virtually every yeast gene (e.g., HMRA1 Sequence page), and include genomic sequence annotations for the Reference Strain S288C, as well as several Alternative Reference Genomes from strains such as CEN.PK, RM11-1a, Sigma1278b, and W303 (more Alternative References coming soon). Each page includes an Overview section containing descriptive information, maps depicting genomic context in Reference Strain S288C (as shown below) and Alternative Reference strains, as well as chromosomal and relative coordinates in S288C.

The sequence itself includes display options for genomic DNA, coding DNA, or translated protein.

Also available on each Sequence page are links to redesigned S288C Chromosome pages, links to new Contig pages for Alternative Reference Genomes, and a Downloads menu for easy access to DNA sequences of several other industrial strains and environmental isolates. The new Sequence, Chromosome, and Contig pages make use of many of the features you enjoy on other new or redesigned pages at SGD, including graphical display of data, sortable tables, and responsive visualizations. The Sequence pages also provide seamless access to other tools at SGD such as BLAST and Web Primer. Please explore these new pages, accessible via the Sequence tab on your favorite Locus Summary page, and send us your feedback.

Categories: Website changes New Data Data updates Sequence

Shared Domains and Phosphorylation Sites on Protein Pages

June 24, 2014

We have redesigned the Protein page to include a new tabular display of protein domains. This table provides the identifier for each domain and illustrates the respective locations of the domains within the protein. In addition to this new table, the domains are displayed in an interactive network diagram that presents the proteins that share these domains with your protein of interest (see figure below, left).

Another new feature on the Protein page is the display of phosphorylation sites within the protein’s sequence (as curated by BioGRID). This feature is available for both the reference strain S288C and other commonly used S. cerevisiae strains; use the pull-down to select the desired strain view (see figure below, right).

Left: Proteins (gray circles) that share domains (colored squares) with Fas1p (yellow circle). Right: an example of some of the phosphorylation sites in Swe1p (red residues).

Proteins that share domains with Fas1p

Swe1p protein sequence with phosphorylation sites highlighted in red.

Categories: New Data Website changes

Explore a Large New Chemogenomics Dataset Via SGD

March 26, 2014

What happens when you cross two comprehensive deletion mutant collections with a library of more than 1800 structurally diverse chemicals? HIP HOP happens. Not the music, but a whole lot of very informative phenotype data – over 40 million data points!

The response of S. cerevisiae mutant strains to a chemical can tell us a lot about which pathways or processes the chemical affects. This is not only interesting for yeast biologists, but also has important implications for human molecular biology and disease research. So a group at The Novartis Institutes of Biomedical Research decided to test the sensitivity of nearly 6,000 mutant yeast strains to a panel of about 1,800 compounds. 

Hoepfner and colleagues have published these results and have also generously offered them to SGD.  They used the HIP and HOP methods (HIP, HaploInsufficiency Profiling, using diploid heterozygous deletion mutant strains; HOP, HOmozygous deletion Profiling, using diploid homozygous deletion mutant strains) that have proven very useful in yeast since the creation of the systematic deletion mutant collections.

To do this mammoth series of experiments they obviously needed to set up an automated pipeline. These sorts of experiments have been done before, but in this study Hoepfner et al. improved on existing procedures in many ways: the physical techniques, the controls and replicates included, and the methods for data analysis.

Phenotype annotations in SGD. We’ve incorporated a subset of these results into SGD as mutant phenotype annotations. Why a subset? Some of the chemicals that were used in these experiments are unnamed proprietary compounds, so the individual phenotypes would not be very informative in the context of SGD. We’ve added the phenotypes that involve named chemicals to SGD – more than 5,500 annotations. These may be viewed on Phenotype Details pages for individual genes (see example), retrieved as a set using YeastMine, or downloaded along with all SGD mutant phenotype annotations in our phenotype data download file.

Easy access to the full dataset and analyses. We’ve also added a new set of links to SGD that take you directly from your favorite gene to the authors’ website, which provides full access to all of the data and interesting ways to look at it (see below). When you click on a “HIP HOP Profile” link from the Locus Summary page or the Phenotype Details page of a gene in SGD, the landing page at the authors’ website allows you to explore data for mutants in that gene or for chemicals affecting that mutant strain. You can see which chemicals had the greatest effects, which other mutant strains have a similar range of phenotypes, and much more. And if a chemical that has interesting effects is proprietary, don’t worry; Hoepfner and colleagues have stated that they “encourage future academic collaborations around individual compounds used in this study.”

Information about mutant strains. In the course of this study, the authors also generated some very useful data about particular mutant strains in the deletion collection. Some of them were hypersensitive to more than 100 different chemicals. Others turned out to be carrying additional background mutations that could affect the phenotypes of the mutant strain. We are planning to display this kind of information (from this and other studies) directly on SGD Phenotype Details pages in the future.

We thank Dominic Hoepfner and colleagues for sharing these data with SGD and for helping us to incorporate the data.  And we encourage you to explore this new resource and contact us with any questions or suggestions.

Links from SGD lead to multiple ways of exploring the full chemogenomics dataset.

Categories: New Data

Transcriptome Data in YeastMine

March 13, 2014

Towards the goal of compiling datasets to produce a complete transcriptome of yeast (the set of all RNA molecules produced in a single cell or population of cells), we have loaded a defined set of transcripts into SGD’s flexible search tool, YeastMine. The set is based primarily on data from Pelechano et al. and is supported by other datasets. It includes transcripts that Pelechano et al. identified by simultaneous determination of the 5’ and 3’ ends of mRNA molecules, and whose end coordinates are supported by datasets from other laboratories.

The transcript data can be accessed in YeastMine using the ‘Gene -> Transcripts’ template, which allows you to specify a gene name or list of gene names and return the list of all associated transcripts based on the collection of data described above. The results include the start and end coordinates for each transcript, the number of counts observed for each transcript in glucose and galactose, notes, and references for the relevant datasets.

Categories: New Data

Human Disease & Fungal Homologs in YeastMine

March 04, 2014

You can now use SGD’s advanced search tool, YeastMine, to find the human homolog(s) of your favorite yeast gene and their corresponding disease associations. Or, begin with your favorite human gene or disease keyword and retrieve the yeast counterparts of the relevant gene(s). As an example, you can search for the S. cerevisiae homologs of all human genes associated with disorders that contain the keyword “diabetes” (view search).

We have recently loaded data from OMIM (Online Mendelian Inheritance in Man) into our fast, flexible search resource, YeastMine, and provided 3 predefined queries (templates) that make it simple to perform the above searches. Newly updated HomoloGene, Ensembl, TreeFam, and Panther data sets are used to define the homology between S. cerevisiae and human genes. The results table provides identifiers and standard names for the yeast and human genes, as well as OMIM gene and disease identifiers and names. As with other YeastMine templates, results can be saved as lists and analyzed further. You can also now create a list of human names and/or identifiers using the updated Create Lists feature that allows you to specify the organism representing the genes in your list. The query for yeast homologs can then be made against this list.

In addition to human disease homologs, we have incorporated fungal homolog data for 24 additional species of fungi. You can now query for the fungal homologs of a given S. cerevisiae gene using the template “Gene –> Fungal Homologs.” This fungal homology data comes from various sources including FungiDB, the Candida Gene Order Browser (CGOB), and PomBase, and the results link directly to the corresponding gene pages in the relevant databases, including Candida Genome Database (CGD) and Aspergillus Genome Database (AspGD).

All of the new templates that query human and fungal homolog data can be found on the YeastMine Home page under the new tab “Homology.” These templates complement the template “Gene → Non-Fungal and S. cerevisiae Homologs” that retrieves homologs of S. cerevisiae genes in human, rat, mouse, worm, fly, mosquito, and zebrafish.

Watch the Human Disease & Fungal Homologs in SGD’s YeastMine tutorial (below) to learn how to find and use these new templates.

Categories: Yeast and Human Disease New Data

Educational Resources on the SGD Community Wiki

February 21, 2014

Did you know you can find and contribute teaching and other educational resources to SGD? We have updated our Educational Resources page, found on the SGD Community Wiki. There are links to teaching resources such as classroom materials, courses, and fun sites, as well as pointers to books, dedicated learning sites, and tutorials that can help you learn more about basic genetics. Many thanks to Dr. Erin Strome and Dr. Bethany Bowling of Northern Kentucky University for being the first to contribute to this updated site by providing a series of Bioinformatics Project Modules designed to introduce undergraduates to using SGD and other bioinformatics resources.

We would like to encourage others to contribute additional teaching or general educational resources to this page. To do so, just request a wiki account by contacting us at the SGD Help desk – you will then be able to edit the SGD Community Wiki. If you prefer, we would also be happy to assist you directly with these edits.

Note that there are many other types of information you can add to the SGD Community Wiki, including information about your favorite genes, protocols, upcoming meetings, and job postings. The Community Wiki can be accessed from most SGD pages by clicking on “Community” on the main menu bar and selecting “Wiki.” The Educational Resources page is linked from the left menu bar under “Resources” from all the SGD Community Wiki pages. For more information on this newly updated page, please view the video below, “Educational Resources on the SGD Community Wiki.”

Categories: New Data Website changes

Tags: teaching, educational, Saccharomyces cerevisiae, genetics

New at SGD: GO Annotation Extension Data, Redesigned GO and Phenotype Pages

February 12, 2014

Annotation Extension data for select GO annotations are now available at SGD. The Annotation Extension field (also referred to as column 16 after its position in the gene_association file of GO annotations) was introduced by the Gene Ontology Consortium (GOC) to capture details such as substrates of a protein kinase, targets of regulators, or spatial/temporal aspects of processes. The information in this field serves to provide more biological context to the GO annotation. At SGD, these data are accessible for select GO annotations via the small blue ‘i’ icon on the newly redesigned GO Details pages. See, for example, the substrate information for MEK1 kinase (image below). Currently, a limited number of GO annotations contain data in this field because we have only recently begun to capture this information; more will be added in the future.
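For readers who work with the gene_association file directly, the following is a minimal sketch of pulling the Annotation Extension out of column 16. It assumes the tab-delimited GAF 2.0 layout, in which a comma joins parts of one extension and a pipe separates alternative extensions. The example line, its identifiers, and the function name parse_annotation_extension are made up for illustration and are not taken from SGD's files.

```python
# Sketch only: reads column 16 (Annotation Extension) from one GAF 2.0 line.
# The example annotation below is hypothetical, not real SGD data.

EXAMPLE_LINE = "\t".join([
    "SGD", "S000000000", "GENE1", "", "GO:0004672",   # columns 1-5
    "PMID:0000000", "IDA", "", "F", "", "",           # columns 6-11
    "gene", "taxon:559292", "20140212", "SGD",        # columns 12-15
    # column 16: two alternative extensions separated by '|'
    "has_substrate(SGD:S000000001),occurs_in(GO:0005634)"
    "|has_substrate(SGD:S000000002)",
    "",                                               # column 17
])


def parse_annotation_extension(gaf_line):
    """Return (go_id, groups) where groups is a list of alternative
    extensions, each a list of (relation, target) pairs."""
    cols = gaf_line.rstrip("\n").split("\t")
    go_id = cols[4]                            # column 5 holds the GO ID
    ext = cols[15] if len(cols) > 15 else ""   # column 16, may be empty
    groups = []
    for group in filter(None, ext.split("|")):   # '|' = alternatives
        pairs = []
        for item in group.split(","):            # ',' = combined parts
            relation, _, target = item.partition("(")
            pairs.append((relation, target.rstrip(")")))
        groups.append(pairs)
    return go_id, groups


if __name__ == "__main__":
    go_id, groups = parse_annotation_extension(EXAMPLE_LINE)
    print(go_id)
    for pairs in groups:
        print(pairs)
```

Annotations without an extension simply yield an empty list, so the same function can be run over every line of the file.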

We have also redesigned the GO Details and Phenotype Details tab pages to make it easier to understand and make connections within the data. In addition to all of the annotations that were previously displayed, these pages now include graphical summaries, interactive network diagrams displaying relationships between genes and tables that can be sorted, filtered, or downloaded. In addition, SGD Paper pages, each focusing on a particular reference that has been curated in SGD, now show all of the various types of data that are derived from that paper in addition to the list of genes covered in the paper (example). These pages provide seamless access to other tools at SGD such as GO Term Finder, GO Slim Mapper, and YeastMine. Please explore all of these new features from your favorite Locus Summary page and send us your feedback.

Categories: New Data Website changes

More Regulation Data and Redesigned Tab Pages Now Available

November 26, 2013

Transcriptional regulation data are now available on new “Regulation” tab pages for virtually every yeast gene. We are collaborating with the YEASTRACT database to display regulation annotations curated both by SGD and by YEASTRACT on these new pages. Regulation annotations are each derived from a published reference, and include a transcriptional regulator, a target gene, the experimental method used to determine the regulatory relationship, and additional data such as the strain background or experimental conditions. The relationships between regulators and the target gene are also depicted in an interactive Network Visualization diagram. The Regulation tab for DNA-binding transcription factors (TFs) includes these items and additionally contains a Regulation Summary paragraph summarizing the regulatory role of that TF, a table listing its protein domains and motifs, DNA binding site information, a table of its regulatory target genes, and an enrichment of the GO Process terms to which its target genes are annotated (view an example). In the coming months we will be adding this extra information to the Regulation pages of other classes of TFs, such as those that act by binding other TFs.

We have also completely redesigned the web display of the Interactions and Literature tab pages, which now include graphical display of data, sortable tables, interactive visualizations, and more navigation options. These pages provide seamless access to other tools at SGD such as GO tools and YeastMine. Please feel free to explore all of these new features from your favorite Locus Summary page and send us your feedback.

Categories: New Data Website changes

Seminal Yeast Literature

August 27, 2013

SGD has compiled a selection of seminal yeast literature, comprising landmark papers in yeast biology. The list is available on the SGD Wiki and includes important publications on cell biology, early genetic maps and genome surveys, and the original S288C sequencing consortium. Also listed are key papers describing the genomes of other sequenced strains of S. cerevisiae.

This new page is just one of the many resources already available on the SGD Wiki, such as What are Yeast?, Protocols, and Job listings. We encourage you to add additional information to any of the SGD Wiki pages. If you don’t already have an SGD Wiki account, please contact the SGD Help Desk to request one.

Categories: New Data

New data tracks added to GBrowse

August 20, 2013

SGD has added new data tracks to the GBrowse genome viewer covering differential expression RNA-seq data from Waern & Snyder (2013) and Transcript Leader RNA-seq data from Arribere & Gilbert (2013).

Download data tracks, metadata and supplementary data by clicking on the ‘?’ icon on each data track within GBrowse or directly from the SGD Downloads site. We welcome new data submissions pre- or post-publication and invite authors to work with us to integrate their data into our GBrowse and PBrowse viewers. Please contact us if you are interested in participating or have questions and comments. Happy browsing!

Categories: New Data

Tags: RNA-seq, GBrowse, differential expression, TL-seq, TATL-seq

Regulation Information Integrated into SGD

July 18, 2013

The Locus Summary pages of 147 DNA-binding transcription factors (TFs; retrieve the list) now include a new tabbed page, Regulation. This page contains information on the regulatory targets of the TF, its binding sites, and its domains and motifs, as well as a free-text paragraph summarizing its biological context. Take a look at a brief video, below, that explains the different kinds of data found on the Regulation tab. In addition to viewing these data page by page, you can download them all using SGD’s data search and retrieval tool, YeastMine. Click on “Regulation” in the YeastMine menu bar to view the predefined templates for regulation data searches.


New Regulation Data at SGD from yeastgenome on Vimeo.

Categories: New Data Website changes

Links to LoQate

July 09, 2013

SGD now provides links to LoQate (the localization and quantitation atlas of the yeast proteome) from the Protein Information section of the Locus Summary pages. The LoQate database provides localization and abundance data for 5300 yeast proteins at single-cell resolution under three different stress conditions: DTT, H2O2, and nitrogen starvation (Breker et al., 2013, J Cell Biol. 200(6), 839-850). Thanks to Maya Schuldiner for helping us set up the links.

Categories: New Data Website changes

YeastMine Upgrade

May 28, 2013

YeastMine, SGD’s powerful search and retrieval tool, has been upgraded to use InterMine version 1.1 software. Highlights of this release include a new format for the template results page, the addition of PantherDB and HomoloGene homolog data, an improved representation of Gene Ontology (GO) information, the ability to set the background population within the GO enrichment widget, and an option to share lists with other users. In addition to the existing video tutorials, a new Help document describes some common queries. See an overview of these new features in the video below, New, Fun YeastMine 1.1!:

New, Fun YeastMine 1.1! from yeastgenome on Vimeo.

Categories: New Data Data updates

Sixty New Expression Analysis Datasets

March 05, 2013

Sixty new datasets have been added to our expression analysis tool at SGD, facilitating the rapid identification of co-expressed genes based on patterns of expression shared with query gene(s) across the entire collection. Expression data are now available at SGD from a comprehensive collection of 430 datasets representing 9190 microarrays from a total of 286 publications. The expression analysis tool can be accessed via the Expression tab and Expression Summary histogram located on Locus Summary pages, or using the ‘Expression’ option in the Function pulldown in the menu bar at the top of SGD pages. The new data will by default be included with the previous data when using the ‘New Search’, ‘Show Expression Levels’, or ‘Dataset Listing’ options. Alternatively, the new datasets can be specifically filtered using the dataset tag ‘not yet curated’. All of the RNA expression data are available for download in the expression directory. Datasets are grouped by publication and are in PCL format.
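To give a feel for what "co-expressed" means here, the sketch below ranks genes by how closely their expression profiles track a query gene. It uses plain Pearson correlation as a stand-in; the scoring actually used by the expression analysis tool is not described in this announcement and may differ. The gene names, values, and function names are hypothetical.

```python
from math import sqrt

# Sketch only: plain Pearson correlation on made-up expression profiles.
# This is an illustrative substitute, not the tool's actual scoring method.

PROFILES = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],    # rises with GENE_A
    "GENE_C": [4.0, 3.0, 2.0, 1.0],    # falls as GENE_A rises
    "GENE_D": [1.0, -1.0, -1.0, 1.0],  # unrelated to GENE_A
}


def pearson(x, y):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def rank_coexpressed(profiles, query):
    """Return (gene, correlation) pairs, most similar to the query first."""
    query_profile = profiles[query]
    scores = [
        (gene, pearson(query_profile, profile))
        for gene, profile in profiles.items()
        if gene != query
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for gene, score in rank_coexpressed(PROFILES, "GENE_A"):
        print(gene, round(score, 3))
```

With many datasets, a per-dataset score like this would still need to be combined across the collection; that step is left out of the sketch.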

Categories: New Data

Links to DRYGIN added to SGD Locus Summary and Interactions pages

June 23, 2012

SGD now provides links from both the Locus Summary and Interactions pages for each S. cerevisiae ORF to DRYGIN (Data Repository of Yeast Genetic Interactions), a database of quantitative genetic interactions of S. cerevisiae (Koh et al., 2010). These genetic interactions were determined from SGA double-mutant arrays conducted in Charles Boone’s laboratory at the University of Toronto, and include both published data (Costanzo et al., 2010) and new interactions released by the Boone laboratory as they become available. Clicking on a DRYGIN link in SGD from an ORF’s Locus Summary or Interactions page goes directly to the DRYGIN search results page for that ORF, which lists both positive and negative genetic interactions as well as any genetic correlations for the given ORF.

Categories: New Data

Tags: DRYGIN, genetic interactions, SGA array

New data tracks added to GBrowse

April 23, 2012

SGD has added a new mix of data tracks to our GBrowse genome viewer from seven publications covering transcriptome exploration via tiling microarrays (David et al. 2006), genomic occupancy of RNA polymerase II and III and associated factors (Kim et al. 2010; Ghavi-Helm et al. 2008), 3′ end processing (Johnson et al. 2011), histone H2BK123 monoubiquitination (Schulze et al. 2011) and high-resolution ChIP by a novel method called ChIP-exo (Rhee et al. 2011; Rhee et al. 2012). Download data tracks, metadata and supplementary data by clicking on the ‘?’ icon on each data track within GBrowse or directly from the SGD downloads page. We welcome new data submissions pre- or post-publication and invite authors to work with us to integrate their data into our GBrowse and PBrowse viewers. Please contact us if you are interested in participating or have questions and comments. Happy browsing!

Categories: New Data

Tags: histone modifications, RNA polymerase II, RNA polymerase III, transcriptome, ChIP-exo

Expression Data and LiftOver Files Available for Download

February 14, 2012

RNA expression data that are included in SGD’s SPELL expression analysis tool are now available for download in the expression directory. Datasets have been grouped by publication and are in PCL format.
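As a starting point for working with the downloaded files, here is a minimal sketch of reading PCL data. It assumes the commonly used tab-delimited PCL layout: a header row with an identifier column, NAME, GWEIGHT, and then one column per condition, followed by an optional EWEIGHT row and one row per gene. The sample text and the function name parse_pcl are invented for illustration; check the actual files before relying on this layout.

```python
import csv
import io

# Sketch only: the sample below is made-up data in the assumed PCL layout.
SAMPLE_PCL = (
    "YORF\tNAME\tGWEIGHT\tcond1\tcond2\n"
    "EWEIGHT\t\t\t1\t1\n"
    "GENE_A\tdescription A\t1\t0.5\t-1.2\n"
    "GENE_B\tdescription B\t1\t\t2.0\n"   # empty cell = missing value
)


def parse_pcl(text):
    """Return (conditions, data) where data maps gene ID to a list of
    floats, using None for missing measurements."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    conditions = rows[0][3:]          # skip ID, NAME, GWEIGHT columns
    data = {}
    for row in rows[1:]:
        if not row or row[0] == "EWEIGHT":
            continue                  # skip blank lines and the weight row
        data[row[0]] = [float(v) if v != "" else None for v in row[3:]]
    return conditions, data


if __name__ == "__main__":
    conditions, data = parse_pcl(SAMPLE_PCL)
    print(conditions)
    print(data)
```

To read a downloaded file instead of the sample string, pass the contents of the file (for example, from open(path).read()) to parse_pcl.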

LiftOver files that allow conversion of chromosomal coordinates between different S. cerevisiae genome versions are also now available for download via the genome_releases link in the sequence directory.
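The idea behind coordinate conversion can be sketched as follows. This assumes the files follow the UCSC-style chain layout, where an alignment is a series of ungapped blocks, each given as (size, gap in the old genome, gap in the new genome), with 0-based positions on the forward strand. The numbers and the function name lift_position are hypothetical, and for real work an established liftOver tool is the safer choice.

```python
# Sketch only: maps one position through a single chain of aligned blocks.
# Block values are invented; they are not from any SGD liftOver file.

# Each block: (size, gap_in_old, gap_in_new). The last block has no gaps.
EXAMPLE_BLOCKS = [
    (50, 5, 0),   # 50 aligned bases, then 5 bases deleted in the new version
    (40, 0, 3),   # 40 aligned bases, then 3 bases inserted in the new version
    (10, 0, 0),   # final 10 aligned bases
]


def lift_position(pos, old_start, new_start, blocks):
    """Convert a 0-based position in the old genome version to the new one.

    Returns None when the position lies outside the chain or inside a
    gap, i.e. when it has no counterpart in the new version.
    """
    old, new = old_start, new_start
    for size, gap_old, gap_new in blocks:
        if old <= pos < old + size:
            return new + (pos - old)   # same offset within the block
        old += size + gap_old          # advance past block and gap
        new += size + gap_new
    return None


if __name__ == "__main__":
    for position in (120, 152, 160, 200):
        print(position, lift_position(position, 100, 100, EXAMPLE_BLOCKS))
```

Positions that fall in a deleted stretch return None rather than a guessed coordinate, which mirrors how such positions are usually reported as unmapped.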

Categories: New Data Website changes

New data tracks added to GBrowse

January 26, 2012

SGD has added a mélange of data tracks to our GBrowse genome viewer from six publications covering various applications of high-throughput sequencing, including genome-wide distributions of DNase I-protected genomic footprints (Hesselberth et al. 2009), recombination-associated double strand breakpoints (Pan et al. 2011), polyadenylation sites (Ozsolak et al. 2010), antisense ncRNAs (Yassour et al. 2010), cryptic unstable transcripts (CUTs) (Neil et al. 2009) and Xrn1-sensitive unstable transcripts (XUTs) (van Dijk et al. 2011). You can now also easily download data tracks, metadata and supplementary data by clicking on the ‘?’ icon on each data track within GBrowse. Please watch our video tutorial for more information on how to download data from GBrowse. We welcome new data submissions pre- or post-publication and invite authors to work with us to integrate their data into our GBrowse and PBrowse viewers. Please contact us if you are interested in participating or have questions and comments. Happy browsing!

View Downloading GBrowse Data at SGD on Vimeo.

Categories: Tutorial New Data

Updated Resource: YPL+

January 25, 2012

Links to YPL+ (the Yeast Protein LocalizationPlus Database) have been added to the “Protein Information” section of SGD Locus Summary pages. YPL+ is a recently upgraded version of the YPL image database, and has been expanded to include GFP-localization data for more than 3500 genes. Data in YPL+ are derived from a collection of GFP fusion constructs generated by C-terminal chromosomal tagging (Huh et al., 2003, Nature 425, 686-691) as well as a collection of proteins involved in lipid metabolism, constructed by in vivo recombination (Natter et al., 2005, Mol. Cell. Proteomics 4(5), 662-672). Thanks to Sepp Kohlwein for help in setting up these links.

Categories: New Data Website changes

New Histone Modification and Variant Data Tracks Added to GBrowse

August 05, 2011

We have added new data tracks to our GBrowse genome viewer from six publications covering various histone acetylation and methylation modifications (Guillemette et al. 2011; Kirmizis et al. 2007; Liu et al. 2005 and Pokholok et al. 2005) and the mapping of histone variant H2A.Z (Albert et al. 2007 and Guillemette et al. 2005). We welcome new data submissions pre- or post-publication and invite authors to work with us to integrate their data into our GBrowse and PBrowse viewers. Please contact us if you are interested in participating or have questions and comments.

Categories: New Data