Displaying Your Own Columns in the UCSC Gene Sorter
 

The Gene Sorter provides dozens of columns containing information on genes computed at UCSC or provided by outside collaborators. In addition to these standard columns, users may also upload their own columns for temporary display in the browser. Custom columns are viewable only on the machine from which they were uploaded and are kept only for 8 hours after the last time they were accessed. Optionally, users can make custom columns viewable by others as well.

Gene Sorter custom columns are based on files in line-oriented format. Each column is described by an initial column line followed by one or more data lines. The column line describes the name, hyperlinks, and other overall characteristics of the column. Each data line contains specific information about a gene annotated by the column. Lines starting with # are ignored. Only one column file may be loaded at a time; however, multiple column descriptions may be included in the same custom file, separated by blank lines.
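
The file layout described above can be sketched in Python as follows. This is a minimal illustration, not the Gene Sorter's own parser: the function name split_columns is hypothetical, and the sketch assumes that each column line occupies a single physical line and that a line whose first word is column always starts a new column description.

```python
def split_columns(text):
    """Group a custom column file into (column_line, data_lines) blocks.

    Assumption (not from the Gene Sorter source): a column line fits on
    one physical line and is recognized by its first word, "column".
    """
    blocks = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            # Blank separators and comment lines carry no column data.
            continue
        if line.split()[0] == "column":
            current = (line, [])
            blocks.append(current)
        elif current is not None:
            current[1].append(line)
        else:
            raise ValueError("data line before any column line: %r" % line)
    return blocks
```

Each returned block pairs one column line with the data lines that follow it, so a file holding several column descriptions yields several blocks.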



  The Column Line
 

Each column description must begin with a column line containing the keyword column followed by an optional set of one or more attribute pairs:

column [attribute1]=[value1] [attribute2]=[value2]...
Attribute values must be enclosed in quotes if they contain spaces or tabs. Attribute names and data values are case-sensitive. The following attributes may be defined:

  • name - Symbolic name of the custom column (not displayed to user).
  • shortLabel - Label displayed at the top of the column in the Gene Sorter display. The default value is User Column.
  • longLabel - Short description of the column displayed after the name on the configuration and filter pages. The default value is User custom column.
  • visibility - Controls whether column is displayed by default: on = display column, off = hide column. The default is on.
  • priority - Specifies the display order of the column relative to others. Columns with lower priority values appear toward the left-hand side of the display. The standard browser columns have priorities between 0 and 20. The default priority is 2.01.
  • itemUrl - URL used to construct hyperlinks accessed by clicking on column data values. If the URL contains %s, the column value will be inserted at that position in the hyperlink string. For example, if itemUrl for a column is defined as http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg16&position=%s (the UCSC Genome Browser URL), clicking on the data value NM_024014 will open the Human Jul. 2003 Genome Browser to the position occupied by RefSeq accession NM_024014. There is no default for this attribute.
  • labelUrl - URL of the hyperlink accessed by clicking on the column label. No default.
  • search - When this attribute is set to one of the following values, column data may be searched using the Gene Sorter search text box. Rows containing matches will be moved to the top of the display.
    • exact - matches only if the text entered in the search text box exactly matches the column data text.
    • prefix - matches if search text exactly matches the initial part of the column data text.
    • fuzzy - matches if search text matches any portion of the column data text.
    By default, no search criterion is set.
  • idLookup - When set, this attribute specifies the standard column that should be used to link key values to the Gene Sorter display. For example, if idLookup is set to refSeq, a custom column data row containing the key NM_024014 will display on the same row as the RefSeq row containing NM_024014. The idLookup values are case-sensitive. By default, idLookup is set to the acc (GenBank) column. To determine the idLookup value that corresponds to a specific standard column, click the column's title in the Gene Sorter display (use the configure button to turn on the column display if it is currently hidden). The near.do.colInfo parameter in the URL linked to the column title is set to the idLookup value that corresponds to that column.
  • isNumber - When this attribute is set to on, the filter page displays numerical max/min controls for this column. Default is off.
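
As an illustration of the attribute rules above, the following Python sketch reads a column line into a dictionary, fills in the documented defaults, and builds an item hyperlink. The names DEFAULTS, parse_column_line, and item_url are hypothetical. The sketch assumes the column line sits on one physical line and that shell-style quoting (as handled by Python's shlex module) is a reasonable stand-in for the Gene Sorter's own quoting rules.

```python
import shlex

# Defaults taken from the attribute list above; attributes documented as
# having no default (name, itemUrl, labelUrl, search) are simply absent.
DEFAULTS = {
    "shortLabel": "User Column",
    "longLabel": "User custom column",
    "visibility": "on",
    "priority": "2.01",
    "idLookup": "acc",
    "isNumber": "off",
}


def parse_column_line(line):
    """Return the attributes of one column line, with defaults applied."""
    tokens = shlex.split(line)  # keeps quoted values with spaces together
    if not tokens or tokens[0] != "column":
        raise ValueError("not a column line: %r" % line)
    attrs = dict(DEFAULTS)
    for token in tokens[1:]:
        # Split on the first "=" only, so URLs containing "=" stay intact.
        name, sep, value = token.partition("=")
        if not sep:
            raise ValueError("expected attribute=value, got %r" % token)
        attrs[name] = value  # attribute names are case-sensitive
    return attrs


def item_url(attrs, value):
    """Insert a data value at each %s in itemUrl; None if itemUrl is unset."""
    template = attrs.get("itemUrl")
    if template is None:
        return None
    return template.replace("%s", value)
```

With itemUrl set to http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg16&position=%s, calling item_url with the value NM_024014 yields the same URL ending in position=NM_024014.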


  Data Lines
 

Data lines are of the format:

[key] [value] 
  • key - links the custom column data value to a data value in the column specified by the idLookup attribute. If idLookup is unset, the browser looks for a match in the acc (GenBank) column.
  • value - data value to be displayed in the custom column row that matches the specified key. The same key may appear on more than one data line, each with a different value. In this case, the column displays a comma-separated list of the values.

Data line keys and values are case-sensitive.
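
The handling of repeated keys can be sketched in Python as below. The function name parse_data_lines is hypothetical. The sketch assumes that the key is separated from the value by whitespace, that the value is the remainder of the line, and that the displayed list uses a comma followed by a space; none of these details is specified above.

```python
def parse_data_lines(lines):
    """Map each key to its display text, joining repeated keys with commas.

    Assumptions: key and value are whitespace-separated, and values for a
    repeated key are shown in file order, joined by ", ".
    """
    collected = {}
    for line in lines:
        parts = line.split(None, 1)  # split once, on any run of whitespace
        if len(parts) != 2:
            raise ValueError("expected 'key value', got %r" % line)
        key, value = parts
        # Keys are case-sensitive, so they are stored exactly as written.
        collected.setdefault(key, []).append(value.strip())
    return {key: ", ".join(values) for key, values in collected.items()}
```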



  Examples
 

Example 1

This example defines a custom column for the Jul. 2003 Gene Sorter. The column's key values are linked to data in the refSeq column. Column rows can be filtered by numerical range by setting the max/min values for the column on the filter page.

#Custom column file for MyLab Trial 3
#
#Column line:
#
column name="MyLab Data" shortLabel="MyLab" longLabel="MyLab Trial 3" 
visibility=on priority=2.05 idLookup=refSeq isNumber=on
#
#Data lines (key links to refSeq column):
#
NM_005523    1.2
NM_005522    4.5
NM_018951    5.1
NM_000522    5.7
NM_030661    9.4
NM_002141    5.2
NM_024014    4.3
NM_006896    6.0
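
To make the numerical filtering concrete, here is a small Python sketch applied to the Example 1 data. The function name filter_by_range is hypothetical, and treating both limits as inclusive is an assumption; the filter page's actual boundary behavior is not described here.

```python
# Key/value pairs copied from the Example 1 data lines.
EXAMPLE_1 = {
    "NM_005523": "1.2",
    "NM_005522": "4.5",
    "NM_018951": "5.1",
    "NM_000522": "5.7",
    "NM_030661": "9.4",
    "NM_002141": "5.2",
    "NM_024014": "4.3",
    "NM_006896": "6.0",
}


def filter_by_range(values, minimum=None, maximum=None):
    """Keep entries whose numeric value lies within [minimum, maximum].

    Assumption: both limits are inclusive, and a limit of None is unset.
    """
    kept = {}
    for key, text in values.items():
        number = float(text)
        if minimum is not None and number < minimum:
            continue
        if maximum is not None and number > maximum:
            continue
        kept[key] = text
    return kept
```

For instance, a minimum of 5.0 and a maximum of 6.0 keeps the four rows with values 5.1, 5.7, 5.2, and 6.0.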

Example 2

This example defines a custom column for the Oct. 2003 mouse Gene Sorter. The column's key values are linked by default to the acc (GenBank) column. Clicking on the column's title (UCSCLab) displays the web page http://genome.ucsc.edu/. Clicking on a specific data value displays the web page http://genome.ucsc.edu/cgi-bin/hgTracks?db=mm4 (the UCSC Genome Browser) at the position specified by the data value. Because search is set to fuzzy, a search on the word MOUSE will match every UCSC BioLab data value that contains the string "MOUSE".

#Custom column file for UCSC BioLab Test Data 
#
#Column line:
#
column name="UCSCLab Data" shortLabel="UCSCLab" longLabel="UCSC BioLab Test Data 4/4/04" 
visibility=on priority=2.05 
itemUrl=http://genome.ucsc.edu/cgi-bin/hgTracks?db=mm4&position=%s 
labelUrl=http://genome.ucsc.edu search=fuzzy 
#
#Data lines (key links to acc (GenBank) column by default):
#
U20370      HXAB_MOUSE
L08757      HXAA_MOUSE
NM_008264   HXAD_MOUSE
# The following 2 lines demonstrate multiple data values for 1 key:
AK083575    NM_010449
AK083575    Q8BNI8
M95599      NM_010451
M28021      HXA5_MOUSE
NM_010454   HXA6_MOUSE
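
The three search settings can be compared on the Example 2 values with a short Python sketch. The function name matches is hypothetical, and the comparison is assumed to be case-sensitive, which the search description does not state explicitly.

```python
# Data values copied from the Example 2 data lines.
EXAMPLE_2_VALUES = [
    "HXAB_MOUSE", "HXAA_MOUSE", "HXAD_MOUSE", "NM_010449",
    "Q8BNI8", "NM_010451", "HXA5_MOUSE", "HXA6_MOUSE",
]


def matches(search_mode, column_text, query):
    """Apply one search setting to a single column value.

    Assumption: comparisons are case-sensitive. An unset or unknown
    search_mode never matches, mirroring "no search criterion is set".
    """
    if search_mode == "exact":
        return column_text == query
    if search_mode == "prefix":
        return column_text.startswith(query)
    if search_mode == "fuzzy":
        return query in column_text
    return False


# With search=fuzzy, MOUSE matches the five values ending in _MOUSE.
fuzzy_hits = [v for v in EXAMPLE_2_VALUES if matches("fuzzy", v, "MOUSE")]
```

Under the exact or prefix setting the same query matches none of these values, since none of them equals or begins with MOUSE.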