Variant Reflector

Variant Reflector takes a VCF file as input and, for each sample in the VCF file, generates variant-reflected cDNA sequences in a FASTA file and a nucleotide/protein mutation list in TSV format.


VCF (bgzip compressed)

Please upload a VCF file compressed with bgzip (vcf.gz). You can create it with bgzip itself or with a tool that writes vcf.gz directly, such as a GATK option. One VCF file may contain at most 10 samples.
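Because a file compressed with plain gzip also ends in .gz, it can be useful to confirm that a file really is bgzip (BGZF) output before uploading. The following is a minimal sketch, not part of Variant Reflector: it inspects the gzip header for the 'BC' extra subfield that BGZF blocks carry. The function names are hypothetical.

```python
import struct


def looks_like_bgzf(header: bytes) -> bool:
    """Return True if the leading bytes of a file look like a BGZF block."""
    # gzip magic (1f 8b), deflate method (08), and the FEXTRA flag bit (04).
    if len(header) < 12 or header[:3] != b"\x1f\x8b\x08" or not header[3] & 0x04:
        return False
    # XLEN (little-endian, 2 bytes) gives the size of the extra field.
    xlen = struct.unpack("<H", header[10:12])[0]
    extra = header[12:12 + xlen]
    # Walk the extra subfields looking for 'B','C' with a 2-byte payload.
    pos = 0
    while pos + 4 <= len(extra):
        si1, si2 = extra[pos], extra[pos + 1]
        slen = struct.unpack("<H", extra[pos + 2:pos + 4])[0]
        if si1 == ord("B") and si2 == ord("C") and slen == 2:
            return True
        pos += 4 + slen
    return False


def is_bgzip_file(path) -> bool:
    """Hypothetical wrapper: check the first bytes of the file at `path`."""
    with open(path, "rb") as handle:
        return looks_like_bgzf(handle.read(64))
```

A file that fails this check but opens as ordinary gzip was probably compressed with gzip rather than bgzip and would need to be recompressed.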

Genome/GFF

Choose a genome and GFF set from the available choices.

Project ID

Set the Project ID to be written on the output directory.

E-mail

Set the e-mail address to which the download link will be sent.


Terms of Service

Data is collected on an anonymized basis and may be used to improve web services and for academic and medical research on cats. Please see the Terms of Service for more information.


To submit, you must agree to these Terms of Service.


Example:

Input file:

・VCF format containing only SNPs and short INDELs.

・Genotype (GT) must be in the FORMAT field.

・The maximum number of samples is 10. To reduce the number of samples in the VCF, please use extract-samples-from-vcf on ANCAT (registration required).

##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##fileDate=20210614
##source=PLINKv1.90
##contig=<ID=AnAms1.0_A1,length=243475044>
##contig=<ID=AnAms1.0_A2,length=174763049>
##contig=<ID=AnAms1.0_A3,length=141440765>
##contig=<ID=AnAms1.0_B1,length=208847529>
##contig=<ID=AnAms1.0_B2,length=167835276>
##contig=<ID=AnAms1.0_B3,length=151101689>
##contig=<ID=AnAms1.0_B4,length=144639378>
##contig=<ID=AnAms1.0_C1,length=223514613>
##contig=<ID=AnAms1.0_C2,length=161461353>
##contig=<ID=AnAms1.0_D1,length=119525305>
##contig=<ID=AnAms1.0_D2,length=90643618>
##contig=<ID=AnAms1.0_D3,length=106503664>
##contig=<ID=AnAms1.0_D4,length=96894663>
##contig=<ID=AnAms1.0_E1,length=65432868>
##contig=<ID=AnAms1.0_E2,length=64410616>
##contig=<ID=AnAms1.0_E3,length=41741383>
##contig=<ID=AnAms1.0_F1,length=74988956>
##contig=<ID=AnAms1.0_F2,length=86115244>
##contig=<ID=AnAms1.0_X,length=129624529>
##contig=<ID=AnAms1.0_unplaced,length=1328>
##INFO=<ID=PR,Number=0,Type=Flag,Description="Provisional reference allele, may not be based on real reference genome">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##bcftools_viewVersion=1.13+htslib-1.13
##bcftools_viewCommand=view -R ../reference/AnAmsCDS.bed.gz -O v -o selected_ext2samples.vcf.gz AnAms_1M_ext2samples.vcf.gz; Date=Sat Jul 17 12:51:24 2021
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT MA02_coco_AnAms_MA02_coco_AnAms PG3483_49_neko1_AnAms_PG3483_49_neko1_AnAms
AnAms1.0_A1 585711 . G A . . PR GT 0/0 0/0
AnAms1.0_A1 585776 . T C . . PR GT 0/0 0/0
AnAms1.0_A1 621926 . G A . . PR GT 0/0 0/0
AnAms1.0_A1 621955 . T C . . PR GT 0/0 0/0
...
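The sample limit and the GT requirement listed above can be checked locally before upload. The sketch below is an illustration only and assumes a tab-delimited VCF; it does not check that the file contains only SNPs and short INDELs. The function names and the MAX_SAMPLES constant are hypothetical.

```python
import gzip

MAX_SAMPLES = 10  # limit stated in this document


def check_vcf_lines(lines):
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    samples = None
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            if len(samples) > MAX_SAMPLES:
                problems.append(f"{len(samples)} samples (maximum is {MAX_SAMPLES})")
            continue
        if samples is None:
            problems.append("data line found before the #CHROM header")
            return problems
        if len(fields) < 9 or "GT" not in fields[8].split(":"):
            problems.append("no GT in FORMAT at " + ":".join(fields[:2]))
            return problems  # the first offending record is enough
    if samples is None:
        problems.append("no #CHROM header line found")
    return problems


def check_vcf_file(path):
    """Hypothetical wrapper for a bgzip- or gzip-compressed VCF file."""
    with gzip.open(path, "rt") as handle:
        return check_vcf_lines(handle)
```

Calling check_vcf_file("selected_ext2samples.vcf.gz") on a file like the example above would be expected to return an empty list.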

Output files:

- process.txt

・Describes the progress of the analysis

・If every step succeeds, the last line is 'Analysis completed'

      

Start Analysis
Parse GFF : start
Parse GFF : finish
Parse VCF : start
Parse VCF : finish
Get CDSs : start
Get CDSs : finish
Reflect mutations : start
Reflect mutations : finish
Analysis completed
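When many runs are downloaded, process.txt can be checked with a script instead of by eye. This is a minimal sketch that assumes the log format shown above ("step : start" / "step : finish", with a final completion line); it also assumes that a failed run leaves a step with "start" but no "finish", which this document does not state. The function names are hypothetical.

```python
def analysis_finished(log_text: str) -> bool:
    """True if the last non-blank line reports completion."""
    lines = [ln.strip() for ln in log_text.splitlines() if ln.strip()]
    # Prefix match so that 'Analysis complete.' and 'Analysis completed' both pass.
    return bool(lines) and lines[-1].startswith("Analysis complete")


def unfinished_steps(log_text: str):
    """Return the names of steps that logged 'start' but never 'finish'."""
    started, finished = [], set()
    for ln in log_text.splitlines():
        name, sep, state = ln.partition(" : ")
        if not sep:
            continue  # not a 'step : state' line
        if state.strip() == "start":
            started.append(name.strip())
        elif state.strip() == "finish":
            finished.add(name.strip())
    return [step for step in started if step not in finished]
```

For a complete log such as the one above, analysis_finished returns True and unfinished_steps returns an empty list.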

- [sample]_genes_broken.txt
List of genes whose ORF is disrupted by variants. (Note that this file is empty if no gene has a broken ORF.)
- [sample]_mutated.fasta
cDNA of genes with variants (FASTA format)
>MA02_coco_AnAms_MA02_coco_AnAms_AnAmsBeta_A1000020.1
ATGTGA
>MA02_coco_AnAms_MA02_coco_AnAms_AnAmsBeta_A1000030.1
ATGTGTGGGCGCTTCTCCGGCTGGGGGACGTCCTCCTGCGTGTCACTCTCAGGCGGGCGCAGCCGGCCCGGTGTTGACCGCCGCGTGGGCGCCCCGACGGGCGGAGGGAGAGGGAAGACGAGCGGGGAGCCACACCATTATGGACTCACAGGAACTGGATTTATTCTCTTGTCTGAAACAATAAAACAAAACCACACAAGATACACAAAATGCAGATTTCCAAGATACTGGGCATCAGGCAAAAGAGGACAGCAATTCCACACTTGGAAACACACAGAATGTTCAACAGGTAAATGGAAAATGTGA
>MA02_coco_AnAms_MA02_coco_AnAms_AnAmsBeta_A1000040.1
ATGAGAATGCTGCTGTGGCTTCCTGTCTTGCTGGGTCTTGGGAGCTGGAGTGCTTTCTCCTGGAATGAAACACAAGCCAAACGCATATCAGAGGGCCTTCAGGACCTGTTTGGCAACATCTCTCAGCTTATTGATAAAGGAAGACTTGGTCTCAATGTGGTCTCTCACAAGGAGTGGGGGGCAGAAGCTGTTGGCTGCAGCACTCCACTGACCAGGCCTGTGGATTTCTTTGTCACGCACCATGTCCCTGGACTGGAGTGTCACAACCGGACTGCATGTAGTCAGAGGCTGTGGGAACTCTGGGACCATCATGTGCACAACAACAGCTGGTGTGACGTGGCCTACAACTTCCTGGTTGGAGACGATGGCAGGGTGTATGAAGGTGTTGGCTGGAACATCCAAGGCATGCACACCCAGGGCTACAACCACATCTCCCTGGGCTTTGCTTTCTTTGGCACCAAGGAAGGCCACAGTCCTAGC
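The mutated FASTA can be read back and screened for ORF problems with a few lines of code. This is a sketch only: the criteria Variant Reflector uses to decide that an ORF is broken are not described here, so the definition below (start codon, length divisible by three, a single terminal stop codon) is an assumption, and the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def read_fasta(lines):
    """Parse FASTA lines into a dict of {record name: sequence}."""
    records, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records[name] = "".join(chunks)
            name, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if name is not None:
        records[name] = "".join(chunks)
    return records


def orf_problems(seq):
    """Return a list of reasons why `seq` is not an intact ORF (assumed definition)."""
    problems = []
    if len(seq) % 3:
        problems.append("length is not a multiple of 3")
    if not seq.startswith("ATG"):
        problems.append("no ATG start codon")
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("no terminal stop codon")
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("premature stop codon")
    return problems
```

Under this definition, the first example record above (ATGTGA) has no problems: it is a start codon followed directly by a stop codon.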
- [sample]_variants_annotated.tsv
Variants and their effects on protein
    

chrom pos ref alt gene sample_gene cds_effect protein_effect genotype
AnAms1.0_A1 624776 C T AnAmsBeta_A1000030.1 MA02_coco_AnAms_MA02_coco_AnAms_AnAmsBeta_A1000030.1 c.164C>T p.T55I 0/1
AnAms1.0_A1 935861 G A AnAmsBeta_A1000100.1 MA02_coco_AnAms_MA02_coco_AnAms_AnAmsBeta_A1000100.1 c.308G>A p.R103Q 0/1
AnAms1.0_A1 1113659 C T AnAmsBeta_A1000150.1 MA02_coco_AnAms_MA02_coco_AnAms_AnAmsBeta_A1000150.1 c.3205G>A p.D1069N 0/1
AnAms1.0_A1 1113659 C T AnAmsBeta_A1000160.1 MA02_coco_AnAms_MA02_coco_AnAms_AnAmsBeta_A1000160.1 c.3448G>A p.D1150N 0/1
AnAms1.0_A1 1159724 G A AnAmsBeta_A1000170.1 MA02_coco_AnAms_MA02_coco_AnAms_AnAmsBeta_A1000170.1 c.407G>A p.R136H 0/1
AnAms1.0_A1 1159864 G A AnAmsBeta_A1000170.1 MA02_coco_AnAms_MA02_coco_AnAms_AnAmsBeta_A1000170.1 c.547G>A p.G183R 0/1
...
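The cds_effect and protein_effect columns are linked by simple arithmetic: CDS position n falls in codon (n - 1) // 3 + 1, so c.164 lies in codon 55, matching p.T55I. The sketch below reads the TSV (assumed to be tab-delimited with the header shown above) and checks that relationship for single-base substitutions. It also guesses the gene strand by comparing the CDS base change with REF/ALT; for example, C>T in the genome appearing as c.3205G>A is consistent with a reverse-strand gene. That strand inference is an assumption for illustration, not documented behaviour, and the function names are hypothetical.

```python
import csv
import re

CDS_SNV = re.compile(r"^c\.(\d+)([ACGT])>([ACGT])$")
PROTEIN_SUB = re.compile(r"^p\.([A-Z*])(\d+)([A-Z*])$")
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def read_variants(lines):
    """Read the annotated TSV into a list of dicts keyed by column name."""
    return list(csv.DictReader(lines, delimiter="\t"))


def describe_snv(row):
    """Summarise a single-base substitution row; return None for other rows."""
    cds = CDS_SNV.match(row["cds_effect"])
    protein = PROTEIN_SUB.match(row["protein_effect"])
    if not cds or not protein:
        return None  # indels or notations this sketch does not handle
    codon = (int(cds.group(1)) - 1) // 3 + 1  # 1-based codon number
    cds_change = (cds.group(2), cds.group(3))
    genomic = (row["ref"], row["alt"])
    complemented = (COMPLEMENT.get(row["ref"]), COMPLEMENT.get(row["alt"]))
    if cds_change == genomic:
        strand = "+"
    elif cds_change == complemented:
        strand = "-"
    else:
        strand = "?"
    return {
        "codon": codon,
        "codon_matches_protein": codon == int(protein.group(2)),
        "strand": strand,
    }
```

With a file handle opened in text mode, rows = read_variants(handle) followed by describe_snv(row) for each row gives one summary per substitution.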

- diff_ref_allele.tsv
Describes the differences between the REF alleles in your VCF and the reference genome
    

#CHROM POS REF_VCF REF_FASTA
AnAms1.0_A1 708993 G A
AnAms1.0_A1 937344 A G
AnAms1.0_A1 1144781 C T
...
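A table like this comes from comparing each VCF REF allele with the bases at the same position in the reference FASTA. How Variant Reflector performs the comparison is not described here; the following is a minimal sketch of the idea, using an in-memory dictionary of sequences and 1-based VCF positions. The function name and the input layout are hypothetical.

```python
def ref_allele_mismatches(records, genome):
    """Compare VCF REF alleles with the reference sequence.

    records: iterable of (chrom, pos, ref) tuples, pos being 1-based as in VCF.
    genome:  dict mapping chromosome name to its sequence string.
    Returns a list of (chrom, pos, ref_vcf, ref_fasta) for each mismatch.
    """
    mismatches = []
    for chrom, pos, ref in records:
        seq = genome.get(chrom)
        if seq is None:
            continue  # chromosome absent from the reference; skipped in this sketch
        # Convert the 1-based VCF position to a 0-based slice of len(ref) bases.
        ref_fasta = seq[pos - 1:pos - 1 + len(ref)].upper()
        if ref_fasta != ref.upper():
            mismatches.append((chrom, pos, ref, ref_fasta))
    return mismatches
```

Each returned tuple corresponds to one row with the columns #CHROM, POS, REF_VCF, and REF_FASTA.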

- input_files.txt
List of input files
      

reference fasta : AnAms1.0.genome.fa.gz

gff : AnAmsBeta.gff.gz

vcf : selected_ext2samples.vcf.gz


Please cite our paper (DOI: 10.1101/2020.05.19.103788)

This analysis has been done using ANCAT. If you are writing a paper using the results, you do not need to cite ANCAT itself, but it would be appreciated if you could add ANCAT to the acknowledgements of your paper.