On-Line Help: BLAST

BLAST (Basic Local Alignment Search Tool) is a family of on-line computer programs designed to analyze DNA and protein sequence data. BLAST retrieves sequence information from numerous on-line sources, including the GenBank and Swiss-Prot databases. Using a sequence-comparison algorithm, BLAST compares a submitted peptide (protein) or nucleotide (DNA) sequence to all of the other peptide and nucleotide sequences that are recorded in the on-line databases. BLAST then reports how similar the submitted (query) sequence is to each of the other sequences that the program has access to.

The BLAST algorithm is based upon the statistical methods of Karlin and Altschul (1990, 1993). There are several sites on the web where you can access and use the BLAST programs. We recommend the National Center for Biotechnology Information page. The Waksman Student Scholars Program is in no way responsible for the BLAST resource.

Upon entering the BLAST input form, you will have to set three options that allow BLAST to analyze your data correctly. First, select which BLAST program you wish to use. This is done through a pull-down menu labeled "Program." Click on the menu and choose either blastp (to analyze protein sequences) or blastn (to analyze nucleotide sequences).
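If you are unsure which program fits a sequence, the choice above can be sketched as a simple letter count. This is only an illustrative heuristic, not part of BLAST: the function name guess_program and the 90% threshold are assumptions made for this example.

```python
def guess_program(sequence: str) -> str:
    """Suggest 'blastn' for nucleotide-like input, otherwise 'blastp'.

    Hypothetical helper: the 0.9 cutoff is an arbitrary assumption.
    """
    # Keep only letters so spaces and line numbers do not affect the count.
    letters = [c for c in sequence.upper() if c.isalpha()]
    if not letters:
        raise ValueError("sequence contains no letters")
    # A, C, G, T (plus U for RNA and N for unknown bases) dominate DNA/RNA.
    nucleotide_like = sum(c in "ACGTUN" for c in letters)
    if nucleotide_like / len(letters) >= 0.9:
        return "blastn"
    return "blastp"
```

A short peptide made only of the letters A, C, G, and T would be misclassified by this sketch, so treat the result as a suggestion and check it against what you know about your data.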

Second, you must choose a database that the BLAST program can use as an information source. This function is controlled by a pull-down menu labeled "database." For most purposes, it is best to simply leave this list at the default "nr" option. "Nr" will allow BLAST to analyze "All non-redundant GenBank CDS translations+PDB+SwissProt+PIR."

Third, you must tell BLAST whether you want your output sent to you by e-mail or displayed in your web browser as HTML. This is accomplished by selecting the check box labeled either "Send reply to the E-mail address:" or "In HTML format". If you elect to have the data sent to you via e-mail, you must type your e-mail address into the nearby input box.

Once these options are set, you can submit your data to BLAST for analysis.

Entering a sequence into the BLAST text form for analysis is very simple. DNA sequence data (found in the "origin" field of GenBank entries) can be pasted directly into the text input box labeled "Enter here sequence in FASTA format:". Protein sequences (single-letter notation) from GenBank, SwissProt, and other databases can also be pasted into this field. The program will ignore spaces and line numbers. After you have entered the sequence that you want to analyze, click on the button labeled "Submit Query."
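The form does this clean-up for you, but if you would rather tidy a sequence before pasting it, the idea can be sketched as follows. This is a minimal illustration, not BLAST's own code: the function names clean_sequence and to_fasta, the header name, and the 60-character line width are assumptions made for this example.

```python
def clean_sequence(raw: str) -> str:
    """Drop spaces, line breaks, and line numbers; keep letters only."""
    return "".join(c for c in raw if c.isalpha()).upper()


def to_fasta(name: str, raw: str, width: int = 60) -> str:
    """Format a sequence as a FASTA record: a '>' header line, then the
    sequence wrapped at `width` characters per line."""
    seq = clean_sequence(raw)
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines)


# Example: text copied from the "origin" field of a GenBank entry.
origin_text = """
        1 atgc gtac
       11 ttga
"""
print(to_fasta("my_query", origin_text))
```

Because the sketch keeps letters only, it would also discard symbols such as "*" or "-" if they appeared in the pasted text.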

Try it; copy this protein sequence to your clipboard:

GDVEKGKKIFIMKCSQCHTVEKGGKHKTGPNLHGLFGRKT GQAPGYSYTA ANKNKGIIWGEDTLMEYLEN PKKYIPGTKM IFVGIKKKEE RADLIAYLKK ATNE

Try to analyze this sequence using the "blastp" program.

The sequence above was obtained by searching the SwissProt database. (Note: the SwissProt database also gives the user the opportunity to submit a protein sequence to the BLAST program directly; see the information on the SwissProt database.)

BLAST provides the user with two types of data. The first is a list of known peptide or nucleotide sequences that most closely match the one that was submitted to the program for analysis. This list is ranked in descending order of relatedness, with the best match at the top of the list. The second data type describes exactly how well each sequence on the list compares with the queried sequence by lining up and comparing each individual nucleotide base or amino acid residue.
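The position-by-position comparison described above can be illustrated with a small sketch. It assumes the two sequences have already been aligned to equal length, with "-" marking gaps; it does not perform the alignment or the statistical scoring that BLAST carries out. The function names percent_identity and match_line are assumptions made for this example.

```python
def percent_identity(aligned_query: str, aligned_subject: str) -> float:
    """Percentage of alignment columns where both sequences agree."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_query:
        raise ValueError("alignment is empty")
    # A column counts as a match only if the letters agree and it is not a gap.
    matches = sum(
        q == s and q != "-" for q, s in zip(aligned_query, aligned_subject)
    )
    return 100.0 * matches / len(aligned_query)


def match_line(aligned_query: str, aligned_subject: str) -> str:
    """Build a marker line: '|' under matching columns, a space elsewhere."""
    return "".join(
        "|" if q == s and q != "-" else " "
        for q, s in zip(aligned_query, aligned_subject)
    )


query = "ACGTAC-T"
subject = "ACGAACGT"
print(query)
print(match_line(query, subject))
print(subject)
print(percent_identity(query, subject))
```

In this example six of the eight columns agree, so the identity is 75 percent; the mismatch and the gap each count against the total.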

BLAST provides output in the form of an HTML page. Optionally, you may elect to have the output sent to you by e-mail.


BLAST input form

