From: Genomic benchmarks: a collection of datasets for genomic sequence classification
Name | # of sequences | # of classes | Class ratio | Median length (bp) | Standard deviation of length (bp) |
---|---|---|---|---|---|
dummy_mouse_enhancers_ensembl | 1210 | 2 | 1.0 | 2381 | 984.4 |
demo_coding_vs_intergenomic_seqs | 100000 | 2 | 1.0 | 200 | 0.0 |
demo_human_or_worm | 100000 | 2 | 1.0 | 200 | 0.0 |
drosophila_enhancers_stark | 6914 | 2 | 1.0 | 2142 | 285.5 |
human_enhancers_cohn | 27791 | 2 | 1.0 | 500 | 0.0 |
human_enhancers_ensembl | 154842 | 2 | 1.0 | 269 | 122.6 |
human_ensembl_regulatory | 289061 | 3 | 1.2 | 401 | 184.3 |
human_nontata_promoters | 36131 | 2 | 1.2 | 251 | 0.0 |
human_ocr_ensembl | 174756 | 2 | 1.0 | 315 | 108.1 |
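
The summary columns above can be recomputed from any labelled set of sequences. The following is a minimal sketch, not the authors' code: the function name `summarize`, the toy data, and the output keys are hypothetical. It assumes that the class ratio is the size of the largest class divided by the size of the smallest, and that the standard deviation is the population standard deviation of sequence lengths; the table itself confirms neither convention.

```python
from collections import Counter
from statistics import median, pstdev


def summarize(sequences, labels):
    """Compute table-style summary statistics for one dataset.

    Assumptions (not confirmed by the source table):
    - class ratio = largest class size / smallest class size
    - standard deviation = population standard deviation of lengths
    """
    if not sequences or len(sequences) != len(labels):
        raise ValueError("sequences and labels must be non-empty and equally long")

    lengths = [len(seq) for seq in sequences]
    class_sizes = Counter(labels)

    return {
        "n_sequences": len(sequences),
        "n_classes": len(class_sizes),
        "class_ratio": round(max(class_sizes.values()) / min(class_sizes.values()), 1),
        "median_length": median(lengths),
        "length_std": round(pstdev(lengths), 1),
    }


# Toy data for illustration only; these are not sequences from the benchmark.
toy_sequences = ["ACGT", "ACGTAC", "AC", "ACGTACGT", "ACGTAC"]
toy_labels = [0, 0, 0, 1, 1]

stats = summarize(toy_sequences, toy_labels)
print(stats)
```

Under these assumptions, a standard deviation of 0.0 in the table corresponds to a dataset in which every sequence has the same length, as with the 200 bp demo datasets.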