Data Format

HS3D (Homo Sapiens Splice Sites Dataset) is a data set of Homo sapiens exon, intron and splice regions extracted from the primate sequences of GenBank Rel. 123.

HS3D is available in the "Downloads" section of this site.

EI_true.seq description:

[Figure: gttrue.jpg]

IE_true.seq description:

[Figure: agtrue.jpg]

EI_false.seq description:

[Figure: gtfalse.jpg]

IE_false.seq description:

[Figure: agfalse.jpg]

Exons.seq example:

Locus : AB002059
Exon N.: 1
Start : 9106
End : 9239

ATGGGCTCCCCAGGG
GCTACGACAGGCTGGGGGCTTCTGGATTATAAGACGGAGAAGTATGTGATGACCAGGAAC
TGGCGGGTGGGCGCCCTGCAGAGGCTGCTGCAGTTTGGGATCGTGGTCTATGTGGTAGG

End : Found
Overall nucleotides : 134
G+C content : 59.7014925373134%
Nucleotide scan : Verified
#
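
The record above can be read with a few lines of code. The following is a minimal sketch, not the tool used to build HS3D. It assumes that fields are written as "name : value", that sequence lines contain only nucleotide letters, that a line holding "#" closes the record, and that Start and End are inclusive coordinates. The function name parse_record and the header/footer split are hypothetical and chosen only for illustration.

```python
import re

# Example record copied from Exons.seq as shown above.
RECORD = """\
Locus : AB002059
Exon N.: 1
Start : 9106
End : 9239

ATGGGCTCCCCAGGG
GCTACGACAGGCTGGGGGCTTCTGGATTATAAGACGGAGAAGTATGTGATGACCAGGAAC
TGGCGGGTGGGCGCCCTGCAGAGGCTGCTGCAGTTTGGGATCGTGGTCTATGTGGTAGG

End : Found
Overall nucleotides : 134
G+C content : 59.7014925373134%
Nucleotide scan : Verified
#
"""

# Assumption: a sequence line holds nothing but nucleotide letters.
SEQ_LINE = re.compile(r"^[ACGTNacgtn]+$")


def parse_record(text):
    """Split one record into header fields, sequence, and footer fields.

    "End" appears twice in a record (coordinate, then "Found"), so fields
    seen before the sequence go to the header and later ones to the footer.
    """
    header, footer, seq = {}, {}, []
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "#":  # blank line or record terminator
            continue
        if SEQ_LINE.match(line):
            seq.append(line.upper())
            continue
        key, _, value = line.partition(":")
        target = footer if seq else header
        target[key.strip()] = value.strip()
    return header, "".join(seq), footer


header, seq, footer = parse_record(RECORD)
start, end = int(header["Start"]), int(header["End"])

# Recompute the footer statistics from the sequence itself.
gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

print(len(seq), end - start + 1, footer["Overall nucleotides"])
print(gc_percent, footer["G+C content"])
```

For this record the recomputed length and G+C percentage agree with the footer, and the length equals End - Start + 1, which is consistent with inclusive coordinates.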

Introns.seq example:

Locus : AB002059
Intron N.: 4
Start : 16651
End : 16840

GTAACTGTGGGCTCTGTCTTCCAGTGCCCC
CAGCAGGGTGGGGGCCGGGCTGGGATCCTGGGTGGCTCCTGAGTGCAGGCCCTGCTCGCC
TCTGTCCCTGCATCTCTCTTTCTGCCAACAACCCCCTGGCTGAAGGCCTCCCCAGGCCTG
CAGAGATTTGAAGGTCTGGAGTTCATCTTTTGTTTTCTAG

End : Found
GT: OK
AG: OK
Overall nucleotides : 190
G+C content : 60.5263157894737%
Nucleotide scan : Verified
#
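
The "GT: OK" and "AG: OK" lines in the intron record can be reproduced from the sequence. The sketch below assumes they report that the intron begins with the dinucleotide GT and ends with AG; the source does not define these fields, so that reading is an assumption, and the function name check_intron is hypothetical. Start and End are again assumed to be inclusive.

```python
# Intron sequence copied from the Introns.seq example above.
INTRON = (
    "GTAACTGTGGGCTCTGTCTTCCAGTGCCCC"
    "CAGCAGGGTGGGGGCCGGGCTGGGATCCTGGGTGGCTCCTGAGTGCAGGCCCTGCTCGCC"
    "TCTGTCCCTGCATCTCTCTTTCTGCCAACAACCCCCTGGCTGAAGGCCTCCCCAGGCCTG"
    "CAGAGATTTGAAGGTCTGGAGTTCATCTTTTGTTTTCTAG"
)


def check_intron(seq, start, end):
    """Return boundary and length checks for one intron.

    Assumption: "GT"/"AG" in the record refer to the first and last two
    nucleotides of the intron, and start/end are inclusive positions.
    """
    seq = seq.upper()
    return {
        "GT": seq.startswith("GT"),
        "AG": seq.endswith("AG"),
        "length_matches": len(seq) == end - start + 1,
    }


result = check_intron(INTRON, 16651, 16840)
print(result)
```

All three checks hold for the example record, matching its "GT: OK", "AG: OK" and "Overall nucleotides : 190" lines.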
