IMA Genome-F 9

Draft genomes of the species Annulohypoxylon stygium, Aspergillus mulundensis, Berkeleyomyces basicola (syn. Thielaviopsis basicola), Ceratocystis smalleyi, two Cercospora beticola strains, Coleophoma cylindrospora, Fusarium fracticaudum, Phialophora cf. hyalina and Morchella septimelata are presented. Both mating types (MAT1-1 and MAT1-2) of Cercospora beticola are included. Two strains of Coleophoma cylindrospora that produce sulfated homotyrosine echinocandin variants, FR209602, FR220897 and FR220899 are presented. The sequencing of Aspergillus mulundensis, Coleophoma cylindrospora and Phialophora cf. hyalina has enabled mapping of the gene clusters encoding the chemical diversity from the echinocandin pathways, providing data that reveals the complexity of secondary metabolism in these different species. Overall these genomes provide a valuable resource for understanding the molecular processes underlying pathogenicity (in some cases), biology and toxin production of these economically important fungi.


INTRODUCTION
Annulohypoxylon stygium (Xylariales, Ascomycota) is a white-rot fungus commonly found on dead wood (Hsieh et al. 2005). Annulohypoxylon stygium displays an extremely high performance in lignin and carbohydrate degradation. Some species of Annulohypoxylon may be used in the cultivation of Tremella fuciformis, one of the foremost medicinal and culinary fungi of China (Stamets 2000). Tremella fuciformis, the white jelly mushroom, is a symbiotic fungus that does not form an edible basidiome without the presence of a specific host fungus (Li et al. 2014). Its preferred host has traditionally been indicated as "Xianghui" in China. Recently, A. stygium was identified to be the main Xianghui species and this has been confirmed experimentally (Deng et al. 2016). Cultivators usually pair cultures of T. fuciformis with this species for industrial production and the formation of T. fuciformis basidiomes is highly dependent on the presence of the specific host fungus, both in nature and for industrial production.
To date, the symbiotic mechanism between A. stygium and T. fuciformis is not understood. The genome sequence of A. stygium from this study may provide useful information for revealing this mechanism.
were used to generate read lengths of 150 bases. The CLC Genomics Workbench v. 6.0.1 (CLCBio, Aarhus, Denmark) was subsequently used to trim reads of poor quality (limit of 0.05) as well as terminal nucleotides. The remaining reads were assembled using SPAdes v. 3.0.0 with an optimized k-mer value of 103 (Bankevich et al. 2012). Thereafter, scaffolding was completed using SSPACE v. 2.0 (Boetzer et al. 2011) and gaps were reduced using GapFiller v. 2.2.1 (Boetzer & Pirovano 2012). The completeness of the assembly was evaluated using BUSCO v. 3 (Simão et al. 2015). Homology-based and ab initio gene prediction were performed to identify A. stygium gene models. Homologous proteins from Laccaria bicolor were aligned to the repeat-masked A. stygium genome using Exonerate v. 2.2.0 (Slater & Birney 2005). The filtered alignments (above 300 bp and 90 % coverage) were used as training models for ab initio gene prediction. The ab initio prediction was conducted using Augustus v. 3.2.3 (Stanke et al. 2008) and GeneMark-ES (Ter-Hovhannisyan et al. 2008), guided by the training models from the homology-based alignments. All gene prediction results were integrated into the final gene models by EVidenceModeler (Haas et al. 2008). Carbohydrate-active enzymes (CAZymes), including the repertoire of auxiliary enzymes, were predicted using dbCAN (Yin et al. 2012).

RESULTS AND DISCUSSION
The genome of A. stygium had an estimated size of 47.5 Mb with an average coverage of 31.26× (Table 1). The N50 size was 598 310 bases, and the assembly had a mean GC content of 46 %. The total number of scaffolds generated was 1854. MAKER predicted a total of 12 498 genes with an average length of 1662 bp. The average gene density of A. stygium was 263 genes/Mb. A phylogenetic analysis of the genus Annulohypoxylon and the closely related genus Hypoxylon is presented to reflect the position of this genome (Fig. 1).
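The N50 and gene-density statistics reported above follow standard definitions. The following is a minimal sketch of how such values can be computed, assuming scaffold lengths are available as a list of integers; the function names and the toy scaffold lengths are illustrative assumptions and are not taken from the assembly itself.

```python
def n50(lengths):
    """Return the N50: the length L such that scaffolds of length >= L
    together contain at least half of the total assembled bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no scaffold lengths supplied")


def gene_density(n_genes, genome_size_mb):
    """Return the number of predicted genes per megabase."""
    return n_genes / genome_size_mb


# Hypothetical toy scaffold lengths (not from the A. stygium assembly).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))

# Gene density from the figures reported in the text:
# 12 498 genes over an estimated 47.5 Mb.
print(round(gene_density(12498, 47.5)))
```

With the reported gene count and genome size, the second call reproduces the stated density of about 263 genes/Mb.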
The draft genome of A. stygium is larger than that of the allied species Xylaria hypoxylon OSC100004 and Hypoxylon sp. CI-4A (Wu et al. 2017), which are 42.9 Mb and 37.7 Mb, respectively. The genome is closer in size to that of Hypoxylon sp. CO27-5 and Hypoxylon sp. EC38 (Wu et al. 2017), which have genome sizes of 46.6 Mb and 47.7 Mb, respectively. Annulohypoxylon stygium also has a similar number of putative genes when compared to Hypoxylon sp. EC38 (12 261 predicted gene models) and Hypoxylon sp. CO27-5 (12 256 predicted gene models).
A total of 757 CAZymes were identified in the genome of A. stygium, more than in the closely related Hypoxylon sp. and Hypoxylon sp. CI-4A (526 CAZymes). The number of CAZymes in A. stygium was much higher than that in Tremella encephala (265 CAZymes; Magnuson et al. 2017) and T. mesenterica (206 CAZymes; Floudas et al. 2012), suggesting that A. stygium may assist Tremella species in the degradation of lignin and carbohydrates, both in nature and in industrial production. The genome sequence data of A. stygium from this study will provide useful information for understanding the mechanism of the symbiotic interaction between A. stygium and T. fuciformis.

INTRODUCTION
A strain of Aspergillus (Y-30462 = DSMZ 5745) was isolated at Hoechst India, then located in the Mulund district of Mumbai, India, from a soil sample collected in Bangladesh (Roy et al. 1987). In the original publication, the fungus was described as an unusual variant of A. sydowii because of the presence of abundant Hülle cells and was published without a Latin description or type specimen as "A. sydowii var. mulundensis". This strain was subsequently re-examined using multi-gene phylogenetic analysis, chemotaxonomic markers, and morphological data and was determined to represent a novel species within Aspergillus sect. Nidulantes (Bills et al. 2016). The primary objective for sequencing the genome of A. mulundensis was the identification of the gene cluster encoding the biosynthesis of the mulundocandins (Yue et al. 2015). Mulundocandin and deoxymulundocandin (Fig. 2) are lipohexapeptides and potent antifungal antibiotics of the echinocandin class (Roy et al. 1987, Mukhopadhyay et al. 1992). Biosynthetically, they are closely related to echinocandin B, but they differ in having serine instead of threonine in the fifth position of the hexapeptide core and a 12-methyl myristoyl side chain instead of a linoleoyl side chain. Mulundocandin and deoxymulundocandin have been investigated extensively as potential lead structures for the development of echinocandin-type antifungal drugs (Mukhopadhyay et al. 1992, Hawser et al. 1999, Lal et al. 2003). This draft genome will expand genomic data sets for comparative genomics of species in Aspergillus sect. Nidulantes.

NUCLEOTIDE SEQUENCE ACCESSION NUMBER
The Aspergillus mulundensis isolate DSMZ 5745 Whole Genome Shotgun project has been deposited in GenBank under the accession number PVWQ00000000.

MATERIALS AND METHODS
For methods for DNA extraction, sequencing, and genome assembly and annotation, see Bills et al. (2016).

RESULTS AND DISCUSSION
The genome of DSMZ 5745 was sequenced to 100-fold coverage, yielding 160 scaffolds with N50 of 2.8 Mb (Table 2). The assembled genome size was 45 Mb, and a total of 11 603 genes were predicted. The GC content of this genome is 43.2 %. The genome contains 53 core catalytic genes associated with putative secondary metabolite biosynthetic gene clusters. These clusters include 25 PKSs, 19 NRPSs, one PKS-NRPS hybrid, four dimethylallyl tryptophan synthases, and four terpene synthases. These genes are distributed among 45 putative gene clusters that also include genes encoding tailoring enzymes, regulators, transporters, and other auxiliary genes. In addition to these gene clusters, 14 secondary metabolite gene clusters containing PKS-like or NRPS-like enzyme genes, or other secondary metabolism-related genes, were identified by antiSMASH. In addition, a gene cluster containing close orthologues of the pneumocandin gene cluster from Glarea lozoyensis (Yue et al. 2015) was recognized, and predicted to be responsible for the biosynthesis of mulundocandins. The nuclear-encoded secondary metabolomes of A. mulundensis and A. nidulans FGSC A4 were compared previously (Bills et al. 2016). A

NUCLEOTIDE SEQUENCE ACCESSION NUMBER
The draft genome sequence of Berkeleyomyces basicola (CMW 49352 = CBS 142796) has been deposited at DDBJ/ENA/GenBank under the accession number PJAC00000000. The version presented here is PJAC00000000.

MATERIALS AND METHODS
Genomic DNA was extracted from lyophilized mycelium of Berkeleyomyces basicola isolate CMW 49352 grown in malt yeast broth (2 % malt extract, 0.5 % yeast extract; Biolab, Midrand, South Africa) using the method described by Duong et al. (2013). A paired-end library was prepared (350 bp average insert size) and sequenced using the Illumina HiSeqX platform. A mate-pair library was prepared (10 Kb average insert size) and sequenced using the Illumina HiSeq2500 platform. Long reads were also generated using one cell of the Single-molecule real time (SMRT or PacBio) sequencing platform (Pacific Biosciences). All sequencing was conducted at Macrogen (Seoul, Korea). Quality and adapter trimming of paired-end and mate-pair reads was carried out using Trimmomatic v. 0.36 (Bolger et al. 2014).
De novo assembly of the genome was carried out using SPAdes v. 3.9 (Bankevich et al. 2012) with all paired-end, mate-pair and PacBio data. Contigs smaller than 500 bp were removed from the dataset. Initial scaffolding was done using SSPACE-standard v. 3.0 (Boetzer et al. 2011) with the paired-end and mate-pair reads. A second round of scaffolding was done using SSPACE-Longread with the PacBio reads. Assembly gaps were filled using GapFiller v. 1.10 (Boetzer & Pirovano 2012) with the paired-end and mate-pair reads, and using PBJelly (English et al. 2012) with the PacBio reads. Final genome polishing was done using Pilon (Walker et al. 2014). Genome completeness was assessed with the Benchmarking Universal Single-Copy Orthologs (BUSCO v. 1.1b1) tool using the Ascomycota dataset (Simão et al. 2015). The number of protein coding genes was determined using Augustus v. 3.3.2 (Stanke et al. 2004) with pre-optimised species models for Fusarium graminearum.
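The removal of contigs smaller than 500 bp described above can be illustrated with a short length filter. This is a minimal sketch only: the text does not state which tool was used for this step, and the function names and toy records below are hypothetical.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_contigs(records, min_len=500):
    """Keep contigs of at least min_len bases (shorter ones are dropped)."""
    return [(name, seq) for name, seq in records if len(seq) >= min_len]


# Hypothetical toy input: three contigs of 499, 500 and 600 bases.
toy_fasta = [
    ">contig_1", "A" * 499,
    ">contig_2", "C" * 250, "C" * 250,
    ">contig_3", "G" * 600,
]
kept = filter_contigs(read_fasta(toy_fasta))
print([name for name, _ in kept])
```

The threshold is inclusive at 500 bp, matching the wording "smaller than 500 bp were removed".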

RESULTS AND DISCUSSION
The paired-end, mate-pair, and PacBio sequencing yielded 431 141 384, 60 673 400 and 42 422 reads, respectively. The final assembly consisted of 81 contigs, with the largest around 3.8 Mb and an N50 of 1.2 Mb. The estimated size of the genome is around 25.1 Mb with a GC content of 52 %. This estimated size is similar to that of other species in Ceratocystidaceae, which range between 25.4 Mb for Huntiella moniliformis and 33.6 Mb for Davidsoniella virescens (Wilken et al. 2013, Van der Nest et al. 2014a, b, Wingfield et al. 2015a, b, 2016a). The phylogenetic position of Berkeleyomyces basicola is presented in Fig. 4.
BUSCO analysis predicted an assembly completeness of 97.4 %. The assembly contained 1280 complete single-copy BUSCOs, one complete and duplicated BUSCO, 10 fragmented BUSCOs and 24 missing BUSCOs out of a total of 1315 BUSCO groups searched. AUGUSTUS annotation predicted 10 074 putative coding regions, corresponding to around 401 ORFs/Mb. The availability of the genome for B. basicola will make possible genome comparisons with other species in Ceratocystidaceae and facilitate investigations into factors involved in pathogenicity, ecology, mating, and evolution of this important plant pathogen.
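The completeness figure above can be recovered from the four BUSCO categories. The sketch below assumes that completeness is defined as the share of complete BUSCOs (single-copy plus duplicated) among all groups searched; the function name is illustrative and the code is not part of the BUSCO software.

```python
def busco_summary(single, duplicated, fragmented, missing):
    """Summarise BUSCO category counts.

    Completeness is taken as (single-copy + duplicated) / total searched.
    """
    total = single + duplicated + fragmented + missing
    complete = single + duplicated
    return {
        "total": total,
        "complete": complete,
        "percent_complete": round(100 * complete / total, 1),
    }


# Category counts reported for the B. basicola assembly.
summary = busco_summary(single=1280, duplicated=1, fragmented=10, missing=24)
print(summary)

# ORF density from the reported 10 074 predicted coding regions and 25.1 Mb.
print(round(10074 / 25.1))
```

The four categories sum to the 1315 groups searched, and the resulting completeness and ORF density agree with the 97.4 % and roughly 401 ORFs/Mb given in the text.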

INTRODUCTION
The genus Ceratocystis as defined by De Beer et al. (2014) is a diverse assemblage of species that are best known as pathogens of angiosperm trees and commercially grown root crops (De Beer et al. 2014, Seifert et al. 2013, Van Wyk et al. 2012). Among these, C. fimbriata s. lat. is arguably the best-known pathogen, and has been associated with diseases of sweet potato (Halsted & Fairchild 1891), taro (Huang et al. 2008), pomegranate (Somasekhara 1999) and kiwifruit (Piveta et al. 2016). The related C. manginecans causes disease on mango and Acacia mangium trees in Oman and Pakistan (Al-Subhi et al. 2006, Al Adawi et al. 2013), while C. eucalypticola is responsible for mortality on commercially planted eucalypt trees in South Africa (Van Wyk et al. 2012). These fungi all share a very similar morphology, making their species boundaries difficult to determine (Harrington et al. 2014). In contrast, several other species in the genus are clearly defined, with universally accepted species status. These include C. albifundus (a pathogen of commercially propagated Acacia mearnsii and Protea cynaroides in South Africa; Lee et al. 2016), C. cacaofunesta (causing cacao wilt in the Caribbean and Central and South America; Engelbrecht et al. 2007), and C. smalleyi (agent of hickory decline in the USA; Johnson et al. 2005).
Ceratocystis smalleyi was first isolated from a hickory tree (Carya sp.) that had been infested by the hickory bark beetle Scolytus quadrispinosus (Johnson et al. 2005). In 2005, C. smalleyi was formally named and described after additional isolates were collected from Carya trees that had been attacked by the hickory bark beetle across parts of the eastern US (Johnson et al. 2005). The authors subsequently linked C. smalleyi with the decline of hickory through a possible association with the bark beetle S. quadrispinosus (Johnson et al. 2005). Later studies have confirmed C. smalleyi as a pathogen on Carya species, and established the close association between the fungus and the bark beetle. This makes C. smalleyi the only known Ceratocystis species to be associated with a bark beetle. In other Ceratocystis species, the production of volatiles is linked to attracting insects for dispersal (Van Wyk et al. 2009). The specific association between C. smalleyi and the vector S. quadrispinosus would eliminate the need for producing volatile attractants, and could explain the inability of this species to produce the fruity odours characteristic of other Ceratocystis species (Harrington 2009, Johnson et al. 2005).

Fig. 4. Maximum likelihood (ML) phylogram derived from the analyses of the partial MCM7 gene sequences for species in the Ceratocystidaceae. CLCbio Genomics Workbench v. 9.5 (CLCbio, QIAGEN, Aarhus, Denmark) was used to screen the genome of B. basicola isolate CMW 49352 to identify and extract the MCM7 gene using an available reference sequence for the gene from B. basicola (Accession: MF967102). A dataset was prepared based on the phylogenies of Nel et al. (2017) and sequences were downloaded from NCBI GenBank. DNA sequence alignments of the dataset were done using the online version of MAFFT v. 7 (Katoh & Standley 2013). The ML analyses were performed in MEGA v. 6.06 (Tamura et al. 2013) using the GTR model. Values shown at nodes are confidence values >75 %. The sequence from the B. basicola genome is indicated in bold.

In this study, we aimed to produce a draft genome assembly for C. smalleyi. This assembly would be the seventh Ceratocystis species for which a genome sequence is published, and adds to the valuable genomic resource available for members of Ceratocystidaceae (Molano et al. 2018, Van der Nest et al. 2014a, b, 2015, Vanderpool et al. 2017, Wilken et al. 2013, Wingfield et al. 2015a, b, 2016a). Furthermore, the availability of a genome assembly will afford the opportunity in future to investigate aspects of the unique biology of C. smalleyi.

NUCLEOTIDE SEQUENCE ACCESSION NUMBER
This Whole Genome Shotgun project for Ceratocystis smalleyi isolate CMW 14800 has been deposited at DDBJ/ENA/GenBank under accession NETT00000000. The version described in this paper is version NETT01000000.

MATERIALS AND METHODS
Ceratocystis smalleyi isolate CMW 14800 was obtained from the culture collection of the Forestry and Agricultural Biotechnology Institute (FABI) and grown on 2 % malt extract agar (MEA: 2 % w/v, Biolab, South Africa) at 25 °C. A 14 d old culture was used to isolate genomic DNA using a previously described phenol-chloroform protocol (Roux et al. 2004). The isolated DNA was submitted for sequencing on an Illumina Genome Analyzer IIx at the UC Davis Genome Centre (University of California, Davis). For sequencing, paired-end libraries of 350 bp and 600 bp insert sizes were prepared and sequenced following the protocol provided by Illumina (www.illumina.com). The raw sequencing reads were imported into CLC Genomics Workbench v. 7.5.1 (CLCBio, Aarhus), and default settings were used both to trim the reads for quality and to produce a de novo genome assembly from the trimmed reads. Scaffolds were generated from the assembly using SSPACE v. 2.0 (Boetzer et al. 2011), while GapFiller v. 2.2.1 (Boetzer & Pirovano 2012) was used to fill any gaps created during scaffolding. Sequencing coverage was estimated by mapping the trimmed sequencing reads to the contigs, while an estimate of the number of putative open reading frames (ORFs) was obtained through de novo gene prediction using the web-based version of AUGUSTUS and gene models from Fusarium graminearum (Keller et al. 2011). The Benchmarking Universal Single-Copy Orthologs (BUSCO v. 1.22) tool was used in combination with the fungal data set to provide a quantitative measure of the level of genome completeness (Simão et al. 2015). The 60S, LSU and MCM7 gene regions were extracted from the genome and, together with these regions from the recently sequenced species C. cacaofunesta (Molano et al. 2018), T. punctulata isolate CMW1032 (Wilken et al. 2018), H. savannae (Van der Nest et al. 2015) and A. xylebori (Vanderpool et al. 2017), were added to the Ceratocystidaceae dataset used for phylogenetic analysis by Wingfield et al. (2017). The resulting datasets were aligned using MUSCLE (Edgar 2004), concatenated, and used to construct a Maximum Likelihood phylogeny using PhyML 3.1 (Guindon et al. 2010) based on model parameters estimated with jModelTest 2.1.10 (Darriba et al. 2012).
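The concatenation of the aligned 60S, LSU and MCM7 datasets can be pictured as joining the per-locus alignments taxon by taxon. The following is a minimal sketch under stated assumptions: each alignment is held as a dictionary mapping a taxon label to its aligned sequence, and a taxon missing from a locus is padded with gap characters. How missing loci were actually handled is not stated in the text, and the taxon labels and sequences below are hypothetical.

```python
def concatenate_alignments(loci):
    """Concatenate per-locus alignments into one supermatrix.

    loci: list of dicts mapping taxon -> aligned sequence. All sequences
    within one locus must have the same length. Taxa absent from a locus
    are padded with '-' (an assumption made for this sketch).
    """
    taxa = sorted(set().union(*(locus.keys() for locus in loci)))
    parts = {taxon: [] for taxon in taxa}
    for locus in loci:
        widths = {len(seq) for seq in locus.values()}
        if len(widths) != 1:
            raise ValueError("sequences within a locus must be equal length")
        width = widths.pop()
        for taxon in taxa:
            parts[taxon].append(locus.get(taxon, "-" * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Hypothetical toy alignments for two loci.
locus_1 = {"taxon_A": "ACGT", "taxon_B": "AC-T"}
locus_2 = {"taxon_A": "GG", "taxon_C": "GA"}
supermatrix = concatenate_alignments([locus_1, locus_2])
for taxon, seq in supermatrix.items():
    print(taxon, seq)
```

Every row of the resulting matrix has the same length, which is the property a downstream likelihood analysis requires of its input alignment.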

RESULTS AND DISCUSSION
The 27 311 342 bp Ceratocystis smalleyi genome was present in 2261 contigs, of which 1242 contigs were larger than 500 bp. The draft assembly yielded a genome with a G/C content of 50.6 %, an average coverage of 84× and 6682 predicted open reading frames at an average gene density of 245 ORFs/Mb. BUSCO analysis indicated a genome completeness of 97 % with 1394 of the 1438 searched orthologs present in the genome being complete. In total 1330 ORFs occurred as single copies while 64 were duplicates. Of the remaining searched homologs, 37 were fragmented while the remaining seven were missing from the genome assembly.
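G/C content, as reported above, is the percentage of guanine and cytosine among the called bases of an assembly. The sketch below shows one common way to compute it, assuming that ambiguous bases such as N are excluded from the denominator; the text does not state which convention was used, and the function name and toy sequences are hypothetical.

```python
def gc_content(sequences):
    """Return the percentage of G and C among A, C, G and T bases.

    Ambiguous characters (for example N) are ignored, which is an
    assumption of this sketch rather than a documented convention.
    """
    gc = at = 0
    for seq in sequences:
        seq = seq.upper()
        gc += seq.count("G") + seq.count("C")
        at += seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("no unambiguous bases found")
    return 100.0 * gc / (gc + at)


# Hypothetical toy contigs (not from the C. smalleyi assembly).
print(gc_content(["GGCC", "ATAT"]))
print(gc_content(["gcNNat"]))

# Gene density from the reported 6682 ORFs over 27 311 342 bp.
print(round(6682 / (27311342 / 1e6)))
```

The last line reproduces the reported density of about 245 ORFs/Mb from the ORF count and assembly size given in the text.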
The genome of C. smalleyi was comparable in size and gene content to that of other Ceratocystis species (Wingfield et al. 2015b, 2016a). At 27.3 Mb, the C. smalleyi genome is slightly larger than that of the related species C. harringtonii (genome size of 26 Mb; Wingfield et al. 2016b), but smaller than the genome of C. manginecans (31.7 Mb; Van der Nest et al. 2014b). Gene densities for published Ceratocystis genomes range from 204-257 ORFs/Mb (Wingfield et al. 2015b, 2016a), and the C. smalleyi gene density falls within this range. In contrast, the 50.6 % G/C content of the C. smalleyi genome is unusually high, with all other Ceratocystis species showing G/C contents below 49 % (Wingfield et al. 2015b, 2016a). The availability of multiple Ceratocystis genomes (Fig. 5) provides the opportunity to study the genetic aspects that underlie ecological and life-style differences between members of this genus. Understanding these differences will also be crucial in explaining at least some of the variations in gene content, genome size, and G/C content evident among these genomes. Currently, the published Ceratocystis genomes make up the bulk of the Ceratocystidaceae genome resource with published genomes available for seven species (Fig. 5; Molano et al. 2018, Van der Nest et al. 2014a, b, Wilken et al. 2013, Wingfield et al. 2015b, 2016b). In addition, published genome sequences are available for five Huntiella species (Van der Nest et al. 2014a, b, 2015, Wingfield et al. 2016b), two Endoconidiophora species (Wingfield et al. 2016a), three isolates representing two Thielaviopsis species (Wilken et al. 2018, Wingfield et al. 2015a, b) as well as for C. adiposa (Wingfield et al. 2016a). This brings the number of published Ceratocystidaceae genomes to 21, with the genome assemblies of several others publicly available (www.ncbi.nlm.nih.gov/assembly/?term=ceratocystidaceae). Such a vast genomic resource will prove valuable to future studies on Ceratocystidaceae, a family that includes fungal species with diverse life-styles and hosts.

INTRODUCTION
The genus Cercospora (Mycosphaerellaceae) includes several economically important plant pathogens causing leaf and fruit spots on a range of agricultural crops worldwide (Groenewald et al. 2013). Cercospora species are known to produce cercosporin, a photo-activated toxin that contributes to pathogenicity on a broad range of crops (Daub et al. 2000). Cercospora beticola is the cause of Cercospora leaf spot (CLS) on sugar and table beet (Beta vulgaris ssp. vulgaris), and Swiss chard (Beta vulgaris ssp. cicla) worldwide (Franc 2010). In New York, CLS is the most important disease affecting foliar health of table beet. Symptoms include leaf spots and necrotic lesions with red to purple margins, which coalesce as the disease progresses, and can result in complete defoliation (Pethybridge et al. 2017). In broadacre production systems, maintenance of foliar health is important to enable mechanized harvest. For fresh market sales, the presence of CLS lesions on the leaves may result in rejection (Pethybridge et al. 2017).
The control of CLS in table beet is dependent on fungicides (Pethybridge et al. 2017). However, resistance to single-site mode of action fungicides threatens the durability of CLS control. Recent studies reported a high frequency of isolates with resistance to quinone outside inhibitor fungicides in New York (Vaghefi et al. 2016). Moreover, succinate dehydrogenase inhibitor fungicides, which are known to be effective in controlling CLS on sugar beet, failed to provide efficacious control on table beet (Pethybridge et al. 2017), and a few isolates with reduced sensitivity to demethylation inhibitors have been detected (Pethybridge, unpubl.). Identifying genomic regions associated with sensitivity to fungicides will enable rapid screening of C. beticola populations. Enhanced genomic information for this pathogen will also facilitate studies into the mechanisms of pathogenicity. De novo genome assemblies of two C. beticola strains from table beet are presented here and made publicly available to facilitate genetic studies of this globally important plant pathogen.

SEQUENCED STRAINS
USA: New York: western New York, Batavia, from Beta vulgaris ssp. vulgaris (table beet)

NUCLEOTIDE SEQUENCE ACCESSION NUMBER
The Whole Genome Shotgun projects have been deposited at DDBJ/EMBL/GenBank under the accessions PDUH00000000 and PDUI00000000.
A total of 5.6 and 5.0 µg of genomic DNA of ICMP 21692 and ICMP 21690, respectively, were used to prepare PCR-free libraries with an average insert size of ~550 bp, which were sequenced on the Illumina MiSeq platform (paired-end, 2×300 bp) at the Cornell University Institute of Biotechnology Genomics Facility (Ithaca, NY). PCR-free libraries were constructed using Illumina's TruSeq Nano DNA LT Sample Preparation kits, according to the manufacturer's protocol. This yielded 4 607 564 and 4 798 846 paired-end reads, totalling 2.7 and 2.9 Gb data for ICMP 21692 and ICMP 21690, respectively. Quality control of the sequences was conducted using FastQC v.0.11.2 (http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc) in the GALAXY portal (Afgan et al. 2016). De novo genome assembly was conducted using DISCOVAR de novo v.52488, an assembler designed for de novo assembly of long Illumina paired-end reads from single PCR-free libraries (Weisenfeld et al. 2014). The completeness of the final assemblies was assessed using Benchmarking Universal Single-Copy Orthologs (BUSCO) v.1.2 (Simão et al. 2015). Gene prediction was conducted in the genome annotation pipeline Maker v.2.31.9 (Cantarel et al. 2008), using contigs at least 500 bp in length only for ICMP 21692. A preliminary annotation used the ab initio gene prediction program SNAP (Korf 2004). The resulting annotation was used to produce a hidden Markov model (HMM) profile for C. beticola, which was further refined with a second stage of SNAP training. The refined HMM file was used for the final annotation (Cantarel et al. 2008).
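The sequencing yields reported above follow from the read counts and read length, and dividing a yield by a genome-size estimate gives the expected sequencing depth. The sketch below assumes that each reported read is a pair of full-length 300 bp reads (an upper bound, since trimming shortens reads) and uses the ~37 Mb genome-size estimate given for C. beticola in the Results; the function names are illustrative.

```python
def sequencing_yield_bp(read_pairs, read_length):
    """Total bases from paired-end sequencing: two reads per pair."""
    return read_pairs * 2 * read_length


def coverage(yield_bp, genome_size_bp):
    """Expected average depth: total sequenced bases / genome size."""
    return yield_bp / genome_size_bp


GENOME_SIZE_BP = 37_000_000  # ~37 Mb estimate reported for C. beticola

for strain, pairs in [("ICMP 21692", 4_607_564), ("ICMP 21690", 4_798_846)]:
    total = sequencing_yield_bp(pairs, 300)
    depth = coverage(total, GENOME_SIZE_BP)
    print(f"{strain}: {total / 1e9:.2f} Gb, {depth:.1f}x")
```

Under these assumptions the yields come to roughly 2.8 and 2.9 Gb and the depths to about 75× and 78×, in line with the reported totals and with the stated coverage of at least 74×.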

RESULTS AND DISCUSSION
The Illumina paired-end (2×300 bp) sequencing of C. beticola isolates ICMP 21692 and ICMP 21690 resulted in 4 607 564 and 4 798 846 reads, respectively, with mean base quality of 28.5 and 28.7. The estimated genome size of C. beticola was ~37 Mb, based on an approximated genome coverage for both strains of at least 74×. The draft genome of ICMP 21692 had a total assembly size of ~35.03 Mbp (for 1 kb+ scaffolds), a scaffold N50 value of 1 023 488 bp, and a maximum contig size of 3 283 856 bp. The draft genome of ICMP 21690 had a total assembly size of ~34.5 Mbp (for 1 kb+ scaffolds), a scaffold N50 value of 654 439 bp, and a maximum contig size of 2 437 838 bp. Both assemblies provided the foundation for global population genetics studies of C. beticola using microsatellite markers and Genotyping-By-Sequencing (Vaghefi et al. 2017a, b). Current studies are focused on identification and characterisation of the genes responsible for sensitivity to fungicides. Availability of genomic data will provide a powerful tool for characterising the genes involved in pathogenicity.

Kanasaki et al. 2006) (Fig. 2). FR220897 and FR220899 are isomers of FR901379, which is used for semisynthesis of micafungin. FR901379 is produced by a different strain of C. empetri, F-11899 (Iwamoto et al. 1994a, b). Differential antifungal activity of these isomers was critical to understanding the effects of the position of the homotyrosine sulfate residue on the antifungal activity (Hino et al. 2001, Kanasaki et al. 2006). Like other echinocandins, the metabolites strongly inhibited β-1,3-glucan synthase and exhibited potent in vitro activity against Candida albicans and Aspergillus fumigatus, and FR220897 was effective in mouse candidiasis models. The discovery of these echinocandin variants was significant because sulfation of the homotyrosine residue overcomes the inherent poor water-solubility that had previously impeded development of echinocandin-type antibiotics, including echinocandin B, aculeacins, and the pneumocandins.

Coleophoma cylindrospora is a widespread endophyte and leaf saprobe and can be a weak pathogen of leaves and fruits of many woody plants (Sutton 1980, Wu et al. 1996, Polashock et al. 2009, Crous & Groenewald 2016). The phylogenetic affinity of the strain producing FR220897 and FR220899 was established with multiple phylogenetic marker sequences and was found to be conspecific with other strains of C. empetri (Yue et al. 2015) (Fig. 7). Subsequently, during a revision of the polyphyletic genus Coleophoma, C. empetri was found to be phylogenetically indistinct from the similar C. cylindrospora and was considered to be a synonym of the latter (Crous & Groenewald 2016).
The primary objective behind sequencing the genome of C. cylindrospora was the identification of the gene cluster encoding the biosynthesis of FR220897 and FR220899 (Yue et al. 2015). The genome sequence will be essential for identifying the mechanism of the regiospecific sulfation reaction. The draft genome has also revealed that the strain harbours an auxiliary copy of β-1,3-glucan synthase that may function as an echinocandin resistance gene (Yue et al. 2018). This draft genome will expand genomic data sets for comparative genomics of species in Leotiomycetes, Dermataceae, and endophytic fungi in general.

NUCLEOTIDE SEQUENCE ACCESSION NUMBER
The C. cylindrospora isolate BP6252 Whole Genome Shotgun project has been deposited in GenBank under the accession number PDLM00000000.

RESULTS AND DISCUSSION
The genome of BP6252 was sequenced to 100-fold coverage, yielding 77 scaffolds with N50 of 2.3 megabases (Mb). The assembled genome size was 42.4 Mb, and a total of 14 177 genes were predicted. The GC content of this genome is 48.7 %. The genome contains 26 core catalytic genes associated with putative secondary metabolite biosynthetic gene clusters. These clusters include 15 PKSs, eight NRPSs, two dimethylallyl tryptophan synthases, and one terpene synthase. These genes are distributed among 21 putative gene clusters that also include genes encoding tailoring enzymes, regulators, transporters, and other auxiliary genes. In addition to these gene clusters, nine secondary metabolite gene clusters containing PKS-like or NRPS-like enzyme genes, or other secondary metabolism-related genes, were identified by antiSMASH. In addition, a gene cluster containing close orthologues of the pneumocandin gene cluster from Glarea lozoyensis (Yue et al. 2015) was recognized and predicted to be responsible for the biosynthesis of FR220897 and FR220899.

(Wu et al. 1996). The strain was fermented to produce three water-soluble echinocandin analogues, designated FR209602, FR209603 and FR209604 (Fig. 2). These analogues differ from FR901379 (WF11899A) and its analogues by a substitution of threonine for serine at the peptide's third amino acid and deoxygenation of the homotyrosine residue at C-4. Like other echinocandins, these metabolites strongly inhibited activity of β-1,3-glucan synthase and exhibited potent in vivo activity against C. albicans and A. fumigatus in murine systemic infection models.
The phylogenetic affinity of the strain producing FR209602 and analogues was established with multiple phylogenetic marker sequences (Yue et al. 2015) (Fig. 7). Although we had retained the original identification of C. crateriformis in previous work on the evolution of the echinocandin pathways, a multi-gene phylogeny indicated the strain was conspecific with other strains named as C. empetri. Subsequently, during a revision of the polyphyletic genus Coleophoma, it was noted that an authentic strain of C. crateriformis, the type species of the genus Coleophoma, was lacking, and thus, its phylogenetic affinities within the genus remained to be determined (Crous & Groenewald 2016). Because strain BP5796 appears to be phylogenetically indistinct from the similar C. cylindrospora, we consider it to be conspecific with the latter (Crous & Groenewald 2016).
The primary motivation for sequencing the genome of C. cylindrospora BP5796 was to identify the gene cluster encoding the biosynthesis of FR209602. The genome sequence will be essential for identification of the mechanism of the regiospecific sulfation reaction. The draft genome has also revealed that, like BP6252, the strain harbours an auxiliary copy of β-1,3-glucan synthase that may function as an echinocandin resistance gene (Yue et al. 2018). This draft genome will expand resources for comparative genomics of species in Dermataceae and endophytic fungi.

NUCLEOTIDE SEQUENCE ACCESSION NUMBER
The C. cylindrospora isolate BP5796 Whole Genome Shotgun project has been deposited in GenBank under the accession number PDLN00000000.

MATERIALS AND METHODS
The methods for DNA extraction, sequencing, and genome assembly and annotation were essentially the same as for strain BP6252 above.

RESULTS AND DISCUSSION
The genome of BP5796 was sequenced to 100-fold coverage, yielding 45 scaffolds with an N50 of 2.0 Mb. The assembled genome size was 40.4 Mb, and a total of 13 257 genes were predicted. The GC content of this genome is 48.5 %. The genome contains 24 core catalytic genes associated with putative secondary metabolite biosynthetic gene clusters. These core genes encode 15 PKSs, six NRPSs, two dimethylallyl tryptophan synthases, and one terpene synthase. They are distributed among 21 putative gene clusters that also include genes encoding tailoring enzymes, regulators, transporters, and other auxiliary genes. In addition to these gene clusters, seven secondary metabolite gene clusters containing PKS-like or NRPS-like enzyme genes, or other secondary metabolism-related genes, were identified by antiSMASH. Furthermore, a gene cluster containing close orthologues of the pneumocandin gene cluster from Glarea lozoyensis (Yue et al. 2015) was recognized and predicted to be responsible for the biosynthesis of FR209602. This gene cluster deviated from other echinocandin gene clusters by the loss of a cytochrome P450 gene orthologous to htyF in A. pachycristatus and GLP450-1 in Glarea lozoyensis which

INTRODUCTION
The genus Fusarium contains numerous well-known socioeconomically important fungi (Nelson et al. 1983). Many of these fungi form part of the Fusarium fujikuroi Species Complex (FFSC), for which various whole genome sequences have been published, e.g. Fusarium fujikuroi (Jeong et al. 2013, Wiemann et al. 2013, Chiara et al. 2015), Fusarium temperatum (Wingfield et al. 2015b) and Fusarium circinatum (van der Nest et al. 2014a). The latter causes pitch canker, which is a devastating disease of pine (Wingfield et al. 2008). Of the five other species found to be associated with F. circinatum-like symptoms on pine in Colombia (Herron et al. 2015), the genome of F. pininemorale has been sequenced. In this study, we determined the whole genome sequence for F. fracticaudum, which was also described by Herron et al. (2015). Like F. pininemorale, this species does not seem to be a pathogen of pine, as it could not incite lesions on the stems of pine seedlings in standard pathogenicity assays (Herron et al. 2015). These differences between F. circinatum and the non-pathogenic Fusarium species on Pinus will provide an opportunity for genome comparisons.
The association of F. fracticaudum with diseased pines and the genetic basis of biological traits in F. fracticaudum are not yet understood. The availability of various sequenced genomes of species within the FFSC is enabling studies into the biology and evolution of these fungi (Ma et al. 2013, De Vos et al. 2014, Niehaus et al. 2016). Here we determine the genome sequence of F. fracticaudum, which will provide an additional resource for comparative genomic studies aimed at understanding the evolution of these fungi and unravelling the molecular basis of their plant interactions.

NUCLEOTIDE SEQUENCE ACCESSION NUMBER
This Whole Genome Shotgun project has been deposited at DDBJ/ENA/GenBank under the accession PDNT00000000. The version described in this paper is version PDNT01000000.

Genome sequence
The F. fracticaudum isolate was grown on ½ potato dextrose agar (PDA; BD Difco™) at 25 °C for 7 d. Genomic DNA was extracted from fungal mycelium following the protocol of Möller et al. (1992). Genome sequencing was done with one paired-end (350 bp median insert size) and one mate-pair (5 kb median insert size) library using the Illumina HiSeq X Ten and HiSeq 2000 platforms, respectively, at Macrogen (Seoul, Korea). CLC Genomics Workbench v. 8.0.1 (CLCBio, Aarhus, Denmark) was used to trim sequences shorter than 18 bp. The quality-filtered reads were subjected to de novo assembly in ABySS v. 1.3.7 (Simpson et al. 2009), followed by scaffolding with SSPACE v. 2.0 (Boetzer et al. 2011). The gaps within the sequences were closed using GapFiller v. 1.11 (Boetzer & Pirovano 2012). To determine the completeness of the genome assembly, BUSCO v. 2.0.1 (Benchmarking Universal Single Copy Orthologs; Simão et al. 2015) was employed using the sordariomycete dataset. Scaffolds were compared to the chromosomes of F. fujikuroi (Wiemann et al. 2013) and F. temperatum (Wingfield et al. 2015b) using the LASTZ plugin (Harris 2007) of Geneious v. 7.0.4 (Kearse et al. 2012). WebAUGUSTUS (Hoff & Stanke 2013) was used to predict genes using the Fusarium graminearum model (http://bioinf.uni-greifswald.de/augustus) and the cDNA data from the F. circinatum genome as gene evidence.

Phylogenetic analysis
Phylogenetic analysis was conducted using partial sequences of the elongation factor 1-α (ef1-α) and β-tubulin genes from other species in the Fusarium fujikuroi species complex (Herron et al. 2015), together with the corresponding sequences from the F. fracticaudum genome determined here. All gene sequences were aligned using MAFFT (Katoh et al. 2009). A maximum likelihood phylogenetic analysis was carried out in PhyML v. 3.1 (Guindon et al. 2010) with 1000 bootstrap replicates, using the GTR+I+G substitution model as determined with jModelTest v. 2.1.7 (Darriba et al. 2012).
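The two-locus data set described above can be illustrated with a minimal sketch of alignment concatenation. The function name `concatenate_alignments`, the taxon labels and the toy sequences are hypothetical, and the sketch assumes each locus has already been aligned separately; it is not the script used for the analysis, which the text does not describe.

```python
def concatenate_alignments(*loci):
    """Join per-locus alignments (dict: taxon -> aligned sequence) into one matrix.

    Taxa absent from a locus are padded with '-' so that every row of the
    combined alignment has the same length.
    """
    taxa = sorted(set().union(*(locus.keys() for locus in loci)))
    combined = {taxon: "" for taxon in taxa}
    for locus in loci:
        lengths = {len(seq) for seq in locus.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within a locus must have equal aligned length")
        width = lengths.pop()
        for taxon in taxa:
            combined[taxon] += locus.get(taxon, "-" * width)
    return combined


# Toy, made-up alignments standing in for the ef1-alpha and beta-tubulin loci
ef1a = {"taxon_A": "ATG-CA", "taxon_B": "ATGACA"}
tub = {"taxon_A": "GGTC", "taxon_B": "GGTT", "taxon_C": "GATC"}

supermatrix = concatenate_alignments(ef1a, tub)
for taxon, seq in supermatrix.items():
    print(taxon, seq)
```

Padding missing loci with gaps keeps every taxon in the matrix; whether taxa with missing loci should instead be excluded is a design choice that depends on the data set.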

RESULTS AND DISCUSSION
The assembled genome of F. fracticaudum was 46.29 Mb long with a GC content of 47.6 %. The assembly consisted of 50 scaffolds with an N50 value of 4 491 441 bp. WebAUGUSTUS predicted a total of 14 729 open reading frames (ORFs) in the assembly. Based on the BUSCO results, the assembly was 98.8 % complete (i.e., complete and single-copy BUSCOs = 97.6 %; complete and duplicated BUSCOs = 1.2 %; fragmented BUSCOs = 0.9 %; missing BUSCOs = 0.3 %; number of BUSCOs searched = 3725). The phylogeny inferred using two protein-coding genes recovered the previously reported relationships among the FFSC species included (Fig. 8) (O'Donnell et al. 2000, Geiser et al. 2005, Herron et al. 2015). The sequences extracted from the F. fracticaudum genome also grouped with those of another isolate (CBS 137233).
Fig. 8. A Maximum Likelihood phylogeny showing the placement of the F. fracticaudum isolate (indicated in bold) that was sequenced in this study. The tree was inferred from combined β-tubulin and translation elongation factor 1-α gene sequences (Herron et al. 2015). Values at branch nodes are bootstrap support values; only those ≥ 85 % are shown. The scale bar indicates substitutions per site.
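The completeness value reported for the F. fracticaudum assembly is the sum of the two "complete" BUSCO categories, and the four categories together should account for all BUSCOs searched. A minimal arithmetic check, using only the percentages given in the text (the variable names are illustrative, and `Decimal` is used simply to avoid floating-point rounding noise):

```python
from decimal import Decimal

# BUSCO percentages reported in the text for the F. fracticaudum assembly
busco = {
    "complete_single_copy": Decimal("97.6"),
    "complete_duplicated": Decimal("1.2"),
    "fragmented": Decimal("0.9"),
    "missing": Decimal("0.3"),
}

# "Complete" BUSCOs are the single-copy and duplicated categories combined
complete = busco["complete_single_copy"] + busco["complete_duplicated"]

# All four categories together should cover every BUSCO searched
total = sum(busco.values())

print(f"complete BUSCOs: {complete} %")
print(f"sum of all categories: {total} %")
```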
In terms of overall genome statistics, the whole genome sequence of F. fracticaudum is similar to those reported for F. pininemorale, F. circinatum, and F. temperatum (Table 3). Also, F. fracticaudum contained the reciprocal translocation between chromosomes 8 and 11 known in these fungi (De Vos et al. 2014). However, sequence comparisons showed that chromosome 12, which is dispensable in other members of the FFSC (Xu et al. 1995), is 1 094 708 bp in size in F. fracticaudum. This is considerably larger than the 692 922 bp reported for F. fujikuroi (Wiemann et al. 2013), 986 231 bp in F. temperatum (Wingfield et al. 2015b), 791 442 bp in F. nygamai (Wingfield et al. 2015a) and 968 722 bp reported in F. pininemorale. The differences observed in these genomes highlight the importance of sequencing the genomes of additional species in the FFSC. The F. fracticaudum genome sequenced here, together with those of other FFSC species, will provide a platform to answer numerous questions pertaining to the evolutionary history of these fungi and their species-specific traits.
They fermented the strain to produce the water-soluble echinocandin analogue FR190293 (Fig. 2). Like other echinocandins, FR190293 strongly inhibited β-1,3-glucan synthase and exhibited potent in vitro activity against Candida albicans and Aspergillus fumigatus. The discovery of this new echinocandin variant was significant because it is the first of the echinocandins to have a dimethyl myristic acid acyl side chain, as in the pneumocandins, in combination with a sulfated homotyrosine residue.

As previously reported, in-depth phylogenetic and morphological analysis of BP5553 demonstrated that the identification as the rotifer parasite T. parasiticum (syn. Pochonia parasitica) was erroneous. Rather than belonging to Clavicipitaceae, BP5553 was found to belong in Helotiales (Yue et al. 2015). Based on rDNA and other protein-encoding sequences, BP5553 falls within a monophyletic lineage along with ex-type strains of Phialophora hyalina, Pleuroascus nicholsonii, Scopulariopsis parva, and Scopulariopsis parvula (Fig. 7). These strains, along with other species with Phialophora-like conidial morphs in Helotiales and BP5553, will eventually comprise a new genus in a new family of Helotiales (W. Untereiner et al., unpubl.).
The primary objective behind sequencing the genome of Phialophora cf. hyalina was the identification of the gene cluster encoding the biosynthesis of FR190293 (Yue et al. 2015). This draft genome will expand genomic data sets for comparative genomics of species in Leotiomycetes and Helotiales.

NUCLEOTIDE SEQUENCE ACCESSION NUMBER
The Phialophora cf. hyalina isolate BP5553 Whole Genome Shotgun project has been deposited in GenBank under the accession number NPIC00000000.

MATERIALS AND METHODS
The methods for DNA extraction, sequencing, and genome assembly were essentially the same as for strain BP6252 above.

RESULTS AND DISCUSSION
The genome of BP5553 was sequenced to 102-fold coverage, yielding 32 scaffolds with an N50 of 3.8 Mb. The assembled genome size was 33.6 Mb, and a total of 10 707 genes were predicted. The GC content of this genome is 48.2 %. The genome contains 45 core catalytic genes associated with putative secondary metabolite biosynthetic gene clusters. These core genes encode 19 PKSs, 13 NRPSs, six PKS-NRPS hybrids, two dimethylallyl tryptophan synthases, four terpene synthases, and one chalcone synthase. They are distributed among 40 putative biosynthetic gene clusters that also include genes encoding tailoring enzymes, regulators, transporters, and other auxiliary genes. In addition to these gene clusters, eight secondary metabolite gene clusters containing PKS-like or NRPS-like enzyme genes, or other secondary metabolism-related genes, were identified by antiSMASH. Furthermore, a gene cluster containing close orthologues of the pneumocandin gene cluster from Glarea lozoyensis (Yue et al. 2015) was recognized and predicted to be responsible for the biosynthesis of FR190293.
(Du et al. 2012, Kanwal et al. 2011, Richard et al. 2015). They have been collected by mycophiles and gourmets for hundreds of years for their delicate taste and unique appearance (Tietel & Masaphy 2017, Rotzoll et al. 2006). Morels also contain a variety of secondary metabolites with medicinal properties (Tietel & Masaphy 2018, Shameem et al. 2017, Pfab et al. 2008, Vieira et al. 2016). Morchella septimelata is a black morel belonging to the Morchella elata clade (Kuo et al. 2012). It is often found in lightly to moderately burned conifer forests, near creek beds, springs and seeps, at an altitude of 1000-2000 m (Pildain et al. 2014). The ascomata of M. septimelata are found primarily in the years immediately following forest fires, and then often appear in dwindling numbers for several seasons

NUCLEOTIDE SEQUENCE ACCESSION NUMBER
The Whole Genome Shotgun project for the M. septimelata isolate (culture collection number SAAS91) has been deposited at DDBJ/EMBL/GenBank under the accession number PYSJ00000000. The version described in this paper is version PYSJ01000000.

MATERIALS AND METHODS
Morchella septimelata MG91 was isolated from forest soil in Liangshan Yi Autonomous Prefecture, Sichuan, China, and was preserved in the Fungal Culture Collection Center of Biotechnology and Nuclear Technology Research Institute (Chengdu, Sichuan). Genomic DNA was extracted from MG91 and sequenced on the Genome Analyzer IIx next-generation sequencing platform (Illumina) at BGI (Shenzhen, China). Paired-end libraries with insert sizes of 425 bp and 725 bp were used to generate reads of 150 bases. CLC Genomics Workbench v. 6.0.1 (CLCBio, Aarhus, Denmark) was subsequently used to trim reads of poor quality (limit of 0.05) as well as terminal nucleotides. The remaining reads were assembled using SPAdes v. 3.0.0 with an optimized k-mer value of 21 (Bankevich et al. 2012). Thereafter, scaffolding was completed using SSPACE v. 2.0 (Boetzer et al. 2011) and gaps were reduced using GapFiller v. 2.2.1 (Boetzer & Pirovano 2012). The completeness of the assembly was evaluated using BUSCO v3 (Simão et al. 2015).
Homology-based gene prediction and ab initio prediction were performed to identify M. septimelata gene models. Homologous proteins from Tuber melanosporum were aligned to the repeat-masked M. septimelata genome using Exonerate v. 2.2.0 (Slater & Birney 2005). The filtered alignments (above 300 bp and 90 % coverage) were used to build training models for ab initio gene prediction. The ab initio prediction was conducted using Augustus v. 3.2.3 (Stanke et al. 2008) and GeneMark-ES (Ter-Hovhannisyan et al. 2008), guided by the training models from the homology-based alignments. All gene prediction results were integrated into the final gene models by EVidenceModeler (Haas et al. 2008). Carbohydrate-active enzymes (CAZymes), including the repertoire of auxiliary enzymes, were predicted using dbCAN (Yin et al. 2012).
To verify the species identity of the sequenced strain, translation elongation factor 1-alpha gene sequences for selected Morchella species (Fig. 9) were aligned with MAFFT (Katoh & Standley 2013). The Bayesian inference (BI) method (Erixon et al. 2003) was used to construct the phylogenetic tree of different Morchella species. jModelTest v. 2.0.2 was used to ascertain the best-fit model for nucleotide substitutions (Darriba et al. 2012). BI analysis was performed with MrBayes v. 3.2.6 (Ronquist et al. 2012). Two independent runs with four chains (three heated and one cold) each were conducted simultaneously for 2 × 10^6 generations. Each run was sampled every 100 generations. We assumed that stationarity had been reached when the estimated sample size (ESS) was greater than 100 and the potential scale reduction factor (PSRF) approached 1.0. The first 25 % of samples were discarded as burn-in, and the remaining trees were used to calculate Bayesian posterior probabilities (BPP) in a 50 % majority-rule consensus tree.
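The sampling scheme described above implies a fixed number of retained trees, which can be worked out with a short calculation. This is a sketch using only the settings stated in the text; the variable names are illustrative, and the count ignores the starting state of each run, which MrBayes may also record, so actual file contents could differ by one sample per run.

```python
generations = 2_000_000   # 2 x 10^6 generations per run (from the text)
sample_every = 100        # sampling interval in generations (from the text)
runs = 2                  # independent runs (from the text)
burnin_fraction = 0.25    # first 25 % of samples discarded (from the text)

# Samples drawn per run, not counting the starting state (an assumption)
samples_per_run = generations // sample_every

# Samples discarded as burn-in and retained for the consensus tree
burnin_per_run = int(samples_per_run * burnin_fraction)
retained_per_run = samples_per_run - burnin_per_run
retained_total = retained_per_run * runs

print(samples_per_run, burnin_per_run, retained_per_run, retained_total)
```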

RESULTS AND DISCUSSION
The genome of M. septimelata had an estimated size of 49.81 Mb with an average coverage of 151.17-fold (Table 4). The scaffold N50 was 37 734 bases, and the assembly had a mean GC content of 47.40 %. The total number of scaffolds generated was 6525. A total of 11 427 genes were predicted, with an average length of 1 571 bp. A phylogenetic analysis of the genus Morchella is provided to show the position of M. septimelata (Fig. 9).
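For readers unfamiliar with the assembly statistics quoted above, the following minimal sketch shows one common way of computing an N50 and a gene density. The function `n50` and the toy scaffold lengths are illustrative assumptions, not the assembler's own implementation; only the gene count and genome size are taken from the text.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of the total assembled bases."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length


# Made-up scaffold lengths, used only to demonstrate the calculation
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_scaffolds)

# Gene density from the values reported for M. septimelata
predicted_genes = 11_427
genome_size_mb = 49.81
gene_density = predicted_genes / genome_size_mb  # genes per Mb

print(toy_n50, round(gene_density))
```

Because N50 depends only on the sorted scaffold lengths, an assembly with many short scaffolds can have a low N50 even when its total size is close to that of a more contiguous assembly.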
A total of 512 CAZymes were identified in the genome of M. septimelata, which is more than in the closely related species M. conica CCBAS932 (401 CAZymes) and M. importuna SCYDJ1-A1 (403 CAZymes), suggesting that the carbohydrate degradation ability of M. septimelata may be stronger than that of these two species. A total of nine secondary metabolite (SM) clusters were found in the M. septimelata genome, of which three were terpene clusters. The genome sequence data of M. septimelata presented in this study will provide useful information for understanding the synthesis mechanism of secondary metabolites in M. septimelata and lay a foundation for the artificial cultivation of M. septimelata.
Average gene length (bp)    1 571
Average gene density (genes/Mb)    229

Predicted secondary metabolite (SM) clusters
Total SM clusters    9
Terpene clusters    3
Type I polyketide synthases (PKSs)    1
Nonribosomal peptide synthetases (NRPSs)    1
Others    4

Fig. 9. A Bayesian inference (BI) phylogenetic analysis of the genus Morchella using MrBayes v3.2.6, based on partial sequences of the elongation factor 1-alpha (EF1-α) gene. Posterior probabilities are shown at the nodes of the tree. The Morchella septimelata sequence used for verification was extracted from the assembled genome. Reference sequences were obtained from the NCBI database and are shown with their accession numbers.