• The idea was conceived in 2009, motivated by several whole-genome and whole-exome sequencing papers.

  • On 2010Feb15, first public release of ANNOVAR.

  • On 2010Mar07, new release (subversion 322) fixed -regionanno issues.

  • On 2010Mar27, a major updated release was uploaded.

  • On 2010Mar30, updated the auto_annovar script and improved ANNOVAR memory management so that it runs in environments with limited memory.

  • On 2010Jun02, the functionality of ANNOVAR is greatly improved, and now includes an optional step to implement SIFT-based annotation of non-synonymous SNPs (that is, predict whether non-synonymous SNPs are detrimental or tolerated), as well as the ability to examine GFF3 databases.

  • On 2010Jun06, several bugs have been fixed, and the convert2annovar.pl program has been added. ANNOVAR can now handle the March 2010 release of the 1000 Genomes Project data.

  • On 2010Jun30, several functions were enhanced and bugs were fixed. This version fixed a problem downloading Ensembl annotations, added the functionality to handle VCF files as annotation databases directly, improved the functionality of the -downdb operation, and fixed gene-based annotation issues due to errors in the FASTA files provided by UCSC. An update is also provided for convert2annovar.pl, which fixes an issue when handling pileup format files with indels.

  • On 2010Aug06, added the summarize_annovar.pl program to convert whole-genome variants data into an Excel file that users can examine using Excel "filter" functions to identify causal mutations. Major changes to the retrieve_seq_from_fasta.pl file such that it can handle several different types of input files, it knows how to handle whole-genome sequence files for several irregularly formatted model organisms (such as chimp), and it produces FASTA records with time stamps. Several known minor bug fixes for the annotate_variation.pl program are also implemented. convert2annovar.pl can now handle MAQ genotype calling output files.

  • On 2010Sep29, minor bug fixes and function enhancement. convert2annovar.pl can now handle CASAVA and VCF version 4 genotype call files, but these functionalities are not mature yet and are being rigorously tested.

  • On 2010Dec02, added support for defining custom precedence in gene-based annotation, changed default precedence to exonic=splicing > ncrna > utr5=utr3 > intronic > upstream=downstream > intergenic; fixed bugs in annotating intronic variants between two UTR-exons as UTR-variants; fixed bugs in reporting amino acid change for reverse strand insertions; added support for 1000G hg19 coordinate (Nov 2010 release); added support for SIFT hg19 coordinate; changed exonic variant annotation (adding cDNA level annotations to amino acid annotations) per user requests; fixed bugs in handling lower-case letters in --genericdbfile.

  • On 2011Jan17, added -colsWanted argument for users to choose the desired output column in DB file, added chrX data to 1000G Nov 2010 data set (use -downdb to re-download the data set), updated gene definition and FASTA file for human and mouse, changed filter operation to handle SNPs with 3 or 4 alleles annotated in dbSNP, changed "stop lost" to "stop loss" in exonic annotation, fixed a bug in summarize_annovar.pl in handling older 1000G files, fixed a bug in convert2annovar.pl in handling insertions for VCF4 files, changed default 1000G file to 2010jul for hg18 in summarize_annovar.pl.

  • On 2011Jan31, fixed the "counts cannot be inferred" issue in convert2annovar.pl, more informative conversion for SamTools pileup file in convert2annovar.pl, added ability to handle the newer version of SOLiD GFF file in convert2annovar.pl, added protein level annotation for exonic deletion, fixed the bug in handling negative strand in dbSNP records. On 2011Jan31 3PM PST, a small bug was discovered and the package was re-uploaded.

  • On 2011Feb11, fixed a bug that was introduced in the 2011-01-31 version to handle dbSNP filtering.

  • On 2011Feb20, changed convert2annovar.pl for more informative handling of pileup files and VCF4 files, changed exonic annotation for frameshift stopgain/stoploss mutations by printing amino acids before stop codon, changed "database annotation error" warning (due to for example co-existence of chr6 and chr6_cox_hap1), ANNOVAR now only examines the first occurrence of a transcript if the transcript is mapped to multiple locations with discordant sequence length, added functionality to perform gene-based annotations using GENCODE or other gene annotation systems, region-based annotation no longer prints Score=0 in the second column, changed output file name for region-based annotation using mceXway, tfbs, band, segdup keywords, fixed a bug in filter-based annotation for block substitution on single nucleotide, retrieve_seq_from_fasta.pl: added warning message for sequences that occur multiple times with discordant lengths, retrieve_seq_from_fasta.pl: no longer processes 'alternative haplotype' chromosomes such as chr6_cox_hap1 by default, fixed a bug in having negative values in cDNA positions when annotating long indels, fixed the bug in not printing out normalized scores when annotating phastCons regions. (Note that a small issue was found after uploading, so an updated file was uploaded on 2011Feb22).

  • On 2011May06, fixed the problem downloading bosTau4 sequence for cow genome, fixed the -separate argument that printed the line column twice in exonic annotation, the ./. genotype in VCF file is annotated as "unknown" in updated convert2annovar.pl, fixed a bug in retrieve_seq_from_db.pl in handling ENSEMBL gene for yeast, added -exonsort argument to sort exon number in output line for gene-based annotation, replaced Em: with Em. for very rare scenarios where UCSC Gene name is prefixed with Em:, fixed auto_annovar bug in handling wrong mce file name due to changes in annotate_variation.pl, fixed problem on handling snp132 files due to different file format, updated convert2annovar.pl to enhance functionality to handle VCF files, updated summarize_annovar.pl to incorporate additional scoring methods in Excel output, added ljb scoring system in filter-based annotation

  • On 2011Jun18, improved the annotation of splicing variants, added -reverse argument to better control -score_threshold argument, added coding_change.pl program to print out protein sequence before and after mutation, added -exonsort argument to annotate_variation.pl to make results stable, added -bedfile argument for region based annotation using BED files as database, fixed a bug in processing VCF files in annotate_variation.pl directly, fixed issues in convert2annovar.pl to handle zygosity status in mpileup file generated by Samtools, added functions to process BED file directly in region annotation

  • On 2011Sep11, significant speedup of filter operation for certain databases (dbSNP, SIFT, PolyPhen, etc), added warning message if user inputs wrong reference allele for exonic mutations, added exon number to splicing annotation in gene-based annotation, changed ncRNA to ncRNA_exon and ncRNA_intron in gene-based annotation, added support for cg69 (complete genomics) database and GERP++ database

  • On 2011Oct02, fixed the cDNA off-by-one error for splicing annotation for acceptor site splicing variants, fixed bug in summarize_annovar.pl when -step argument is used, ANNOVAR now prints out examples when exonic SNPs have WRONG reference alleles specified in your input file, fixed the bug on indexing-based filter search on dbSNP (indexing-based search now requires '-webfrom annovar' when -downdb is used), fixed certain ncRNA annotation errors (such as ncRNA_UTR5, ncRNA_exonic) when the variant hits both coding and noncoding gene, fixed the bug to annotate ncRNA_exonic with exonic_variant_function, only coding transcripts will be used in gene-based annotation if a gene has coding and noncoding transcripts

  • On 2011Nov20, mRNA FASTA sequences without complete ORF annotation will no longer be used in exonic annotation, fixed the bug in specifying ensgene in command line in auto_annovar and summarize_annovar, fixed the problem in handling dbSNP132 in hg19 coordinate, slightly changed the "exonic SNPs have WRONG reference alleles" warning message to be more clear, retrieve_seq_from_fasta.pl now reports transcripts whose ORF have premature stop codon, fixed the hg18_cg69 and hg19_cg69 allele frequency error, convert2annovar.pl supports GFF3 files generated by 5500SOLiD and the LifeScope software

  • On 2012Feb23, added esp5400_ea, esp5400_aa, esp5400_all keywords for allele frequencies in 5400 exomes, added ljb_sift, ljb_gerp++, ljb_all databases for faster/easier retrieval of whole-exome functional scores, updated mRNA sequence files for hg18 and hg19 gene definitions, all custom databases have newer/faster index and default -indexfilter argument is now 0.9, added -otherinfo argument for -filter operation to print additional information in annotation, slight changes to convert2annovar.pl to better handle CASAVA files, fixed the problem in handling UCSC genes whose names contain spaces, fixed the bug that -reverse does not work for "-dbtype avsift", other minor bug fixes

  • On 2012Mar08, added ability to handle 1000G 2012feb version, fixed bug in -allallele argument in convert2annovar.pl when handling more than two alternative alleles in VCF files, slight change to handle latest knowngene annotation due to format change of kgXref file, -verbose now prints out noncoding transcripts that are ignored in analysis in gene-based annotation

  • On 2012May25, -downdb works for 1000g2012apr now, mutations in beginning or end of transcript are no longer reported as splicing variants, added -seq_padding argument to pad flanking amino-acid sequence around indels, added -indel_splicing_threshold argument to better annotate splicing variants around indels, the -colsWanted argument now works on BED database files, fixed problem with -colsWanted argument if the desired column contains comma, minor fix to region annotation when the region in database itself is a zero-length insertion, fixed the bug in complaining "wholegene," in annotation output, enhanced handling of errors in VCF4 files (such as presence of N in alleles) in convert2annovar.pl, added -infoasscore argument for printing entire INFO field in VCF database files

  • On 2012Oct23, added -veresp argument to summarize_annovar.pl to support esp6500 data set, added -aamatrixfile argument to print out amino acid substitution scores such as Grantham scores, changed UCSC download from FTP to HTTP to help users with firewall settings, fixed a problem handling genericdb file when chr prefix is present for chromosomes, fixed a problem downloading index for gerp++gt2 files, added variants_reduction.pl program

  • On 2013Feb11, mitochondria genome is now supported, the -zerostart argument is no longer supported, better handling of GFF3 files with undefined scores, added -gff3attr argument so that attribute field from GFF3 file can be printed in output, changed summarize_annovar.pl to take -alltranscript argument to print out all isoforms for exonic variants, summarize_annovar.pl now takes esp6500si and snp137NonFlagged as databases, exonic variants near intron/exon boundaries are no longer reported as splicing unless -exonicsplicing is set, fixed a minor issue in finding tar program in BSD-derived operating system, convert2annovar.pl now handles *.gz file or handles stdin as input file name, convert2annovar.pl accepts -comment argument to keep comment lines in VCF4 file in output.

  • On 2013Feb21, fixed a bug that exonic variants at exon end were annotated as splicing when -exonicsplicing is not set.

  • On 2013May09, fixed a bug in line count of exonic_variant_function when handling more than 5 million variants, table_annovar.pl is implemented to replace summarize_annovar.pl, changed -downdb behavior on 1000G data sets, convert2annovar.pl now handles soap format with 17 fields, corrected some typos in help message, fixed a bug that exonic variants at exon end were annotated as splicing when -exonicsplicing is not set

  • On 2013Jun21, fixed a bug in table_annovar to have empty output when input is in five-column format, fixed a bug in table_annovar for avsift output, fixed a bug when handling start position for multi-allelic SNPs in dbSNP, fixed a bug when scanning indels with multi-allelic variants in VCF DB file, fixed a bug when chr prefix is present in filter database, fixed a bug in annotate_variation.pl to report position of coding insertions in negative strand, small change to retrieve_seq_from_fasta so it handles zebrafish correctly

  • On 2013Jul28, much improved VCF conversion function in convert2annovar.pl, improved functionality of table_annovar including support for ljb2* databases, new databases such as nci60 and popfreq_all are now supported, disabled -sortout in table_annovar due to many bugs, updated Grantham matrix due to inconsistencies with original publication

  • On 2013Aug23, convert2annovar.pl no longer complains when VCF file does not have a valid header, fixed a small bug in convert2annovar.pl to handle certain classes of indels, table_annovar now works on non-human species, minor fix in annovar to handle certain mouse mutations, ccdsGene annotation uses transcript ID as gene name due to lack of gene name in previous versions, implement dup keyword in exonic variant annotation to better conform to HGVS standards

  • On 2014Jul14, table_annovar now supports -tempdir argument, table_annovar now supports VCF input format and writes to output VCF file, table_annovar now uses separate columns for splicing/UTR notations, convert2annovar can generate all possible SNVs/indels in a genomic region or in a transcript, convert2annovar can generate ANNOVAR input files for list of dbSNP identifiers, improved convert2annovar to better handle block indel/substitution in VCF4, changed 'stopgain SNV' and 'stoploss SNV' to 'stopgain' and 'stoploss' as they apply to both SNVs and indels, added -withfilter argument to convert2annovar to print out FILTER field in output for VCF files, fixed bug to handle VCFdbfile when AF record is in scientific notation, added UTR cdot annotation, improved table_annovar to print out correct column headers when the database file has the information, fixed a convert2annovar bug when VCF file does not have valid VCF header but -includeinfo is specified, added details for splicing variant when -separate flag is used, minor bug fix for dup annotation for insertions in negative strand, minor change to default parameters in table_annovar.pl for ljb2_pp2, fixed an error in convert2annovar to handle multi-sample VCF files

  • On 2014Nov12, fixed a problem of convert2annovar with some NR records in VCF file, fixed a bug in handling dup variants in coding_change.pl, improved the ability to handle VCF file in table_annovar.pl, significantly reduced memory usage for filter annotation, improved compatibility for unconventional chromosome names for species such as tomato, fixed a problem in annotating against multi-allelic indels in VCF file, fixed a problem in exon numbering for splice variants in reverse strand, refGene version file included in -downdb

  • On 2015Mar22, fixed a bug where -score_threshold does not work with ljb2* databases, added -poscolumn argument to fine-tune region-based annotation, convert2annovar.pl can now handle PATH automatically and better predicts depth coverage in VCF file, handle situations where a variant disrupts both ncRNA_exonic and splice site in gene-based annotation, slight change in retrieve_seq_from_fasta.pl to increase compatibility for certain species, improved table_annovar.pl to print out all columns for exac03 and other databases, fixed a bug in coding_change.pl to handle c.dup identifiers

  • On 2015Jun16, improve convert2annovar to handle CASAVA format better, enable convert2annovar to handle ANNOVAR to VCF conversion for specific input files, improve backward compatibility of table_annovar.pl for ljb and popfreq databases, add avdblist keyword to list all databases provided by '-webfrom annovar', add tilde expansion for annotate_variation.pl, fix bug in convert2annovar to handle gz files, add ability to handle vcf.gz file for table_annovar, change exit code for failure to downdb in annotate_variation, improve variants_reduction.pl to handle more genome builds, change FASTA line to indicate mutation position in coding_change.pl, fix exon count bug for splice variant on negative strand, improve compatibility for certain plant genomes

  • On 2015Dec14, enabled multi-threaded ANNOVAR for gene, region and filter annotation, add -idasscore argument to better handle VCFdb file in filter annotation, improve convert2annovar to better handle VCF file without header information, fix exon count bug for splice variant on negative strand

  • On 2016Feb01, fixed a bug in multi-threaded gene annotation when thread is more than 6, added -maxgenethread argument to table_annovar.pl

  • On 2017Jun01: gx operation is added in table_annovar so that xref information for genes can be included, show complete amino acid change in gene annotation in table_annovar.pl and coding_change.pl, add ability to handle avsnp file in convert2annovar.pl, fixed a bug that misses upstream/downstream variants when -separate is used, added support to use # rather than comma in -argument and -precedence and -avcolumn and -chromosome, upstream variants now show distance to transcriptional start, splice variants at UTR now show details

  • On 2017Jul16: fixed a bug in calculating the upstream distance that is printed when -separate is specified in annotate_variation.pl, improvements to coding_change.pl to report more stopgain/stoploss and fix use-of-uninitialized-value issue, slight change to convert2annovar.pl to handle malformed VCF file.

  • On 2018Apr16: added r.spl as a notation for certain indels based on HGVS guidelines, fixed a bug in calculating amino acid sequence when a block substitution covers both exon and intron, fixed a bug in classifying block substitutions as inframe mutations in coding_change.pl, coding_change.pl now handles gene names that contain dash or dot, fixed a bug in table_annovar.pl when -dot2underline is used, shortened the protein sequence calculation for nonframeshift mutations in coding_change.pl, add -keepindelref argument in convert2annovar.pl to keep Ref/Alt alleles in VCF file, improved compatibility of coding_change.pl for certain gene names that contain @ and . characters

  • On 2019Oct24: allow refGeneWithVer as a valid gene annotation when using -downdb argument in annotate_variation.pl; add -intronhgvs argument to table_annovar.pl to print out HGVS notations for intronic variants; add startloss and startgain as functional consequences that affect the first ATG codon; add -nofirstcodondel to table_annovar by default to enable calculation of amino acid changes for certain variants previously annotated as 'wholegene'; minor adjustment on nonframeshift vs startloss vs stopgain for certain variants with multiple valid notations; changed p. notation for block substitution that does not cause protein change; changed table_annovar so that ExAC and gnomAD are treated as float fields in VCF annotation; allow genericdb for region annotation; allow chromosome name to contain . or - for certain species; the -polish argument is ON by default in table_annovar.pl; table_annovar.pl can generate column headers such as Otherinfo, Otherinfo2, Otherinfo3, etc; fixed a bug of cdot notation for block substitutions that cover 5UTR and start codon