
Mitochondrion

Check available resources on mitochondrion and assembly

Available complete mitochondrial genomes

NCBI accession project sample submission reference size (bp)
NC_042226 PRJNA541607 N/A 7-May-2019 Chen et al 16,061
MH536744 PRJNA541607 N/A 25-Jun-2018 Chen et al 16,061
LR991693 PRJEB42171 SAMEA7521635 25-Jan-2021 N/A (Cambridge) 22,664
MT483700 N/A N/A 18-May-2020 N/A (Copenhagen) 17,279
  • NC_042226 and MH536744 are identical.
  • LR991693 is extracted from the genome assembly aRanTem1.1
  • MT483700 is partial.
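The claim that two accessions are identical can be checked once both FASTA files are downloaded. The following is a minimal sketch, assuming each file holds a single record; `fasta_seq` and `same_sequence` are hypothetical helper names, and the file paths in the usage comment are assumptions.

```shell
# Print the sequence of a single-record FASTA file as one upper-case line,
# ignoring the header and any line wrapping.
fasta_seq() {
    grep -v '^>' "$1" | tr -d '\n\r' | tr '[:lower:]' '[:upper:]'
}

# Exit status 0 when both files hold exactly the same sequence.
same_sequence() {
    [ "$(fasta_seq "$1")" = "$(fasta_seq "$2")" ]
}

# Usage (hypothetical paths):
# same_sequence NC_042226.fasta MH536744.fasta && echo "identical"
```

Comparing the unwrapped sequences rather than the raw files avoids false differences caused by header text or line length.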

Available complete nuclear genomes

The mitochondrial genome sequence can be extracted from a nuclear genome assembly.

Genome assembly project sample submission reference
aRanTem1.1 PRJEB42171 SAMEA7521635 28-Jan-2021 N/A (Cambridge)
UCB_Rtem_1.0 PRJNA550264 SAMN12123141 27-Dec-2019 N/A (Berkeley)
  • The mitochondrion assembly LR991693 is extracted from aRanTem1.1
  • No mitochondrial genome has been extracted from UCB_Rtem_1.0
  • The Illumina raw reads of UCB_Rtem_1.0 needed to extract the mitogenome are not available: https://www.ncbi.nlm.nih.gov/Traces/wgs/VIAC01

Designing mitochondrial baits

We considered the two high-quality mitochondrial genome assemblies available in NCBI, MH536744 and MT483700. Although less well assembled, a third, non-annotated mitochondrial sequence (LR991693) was also included, since it aligned correctly except for two regions of 3,500 and 1,500 bp. We targeted the complete mitochondrial genome, including the less conserved regions in both intergenic and coding sequences. To that end, we aligned the three mitochondrial genomes above with MAFFT, run from Geneious. Positions containing any gaps were masked and removed. Baits were designed with BaitDesigner as 180 bp sequences spread across the entire mitochondrial genome, without overlap, since no mitochondrial assembly is needed. The T7 promoter was added to the 5' end of each bait for later transcription with the NEB kit E2040S. Altogether, there was only one ambiguity character, which was simply disregarded.
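The tiling step can be illustrated with a short sketch. This is not BaitDesigner and not the authors' procedure, only an awk approximation of non-overlapping 180 bp windows with a promoter prefix. It assumes a single-record FASTA holding the gap-free sequence; `tile_baits`, the bait naming scheme and the file names are hypothetical, and the T7 promoter sequence shown in the usage comment is the commonly cited one, not necessarily the one used here.

```shell
# tile_baits FASTA [LENGTH] [PREFIX]
# Cut the sequence into consecutive, non-overlapping windows of LENGTH bases
# (default 180) and prepend PREFIX (e.g. a promoter) to each window.
tile_baits() {
    awk -v len="${2:-180}" -v prefix="${3:-}" '
        /^>/ { next }                                   # skip the header line
        { gsub(/[ \t\r]/, ""); seq = seq toupper($0) }  # join wrapped lines
        END {
            n = 0
            # step by the bait length so that windows never overlap;
            # a trailing fragment shorter than len is dropped
            for (start = 1; start + len - 1 <= length(seq); start += len) {
                n++
                printf ">bait_%d_%d-%d\n%s%s\n", n, start, start + len - 1, prefix, substr(seq, start, len)
            }
        }' "$1"
}

# Usage (hypothetical file names, assumed promoter sequence):
# tile_baits masked_mitogenome.fasta 180 TAATACGACTCACTATAGGG > baits.fasta
```

Stepping by the full bait length is what makes the windows non-overlapping; a tiling density above 1x would use a smaller step.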

Transcription of DNA baits to RNA

As the mitochondrial genome of Rana temporaria is about 16,000 bp long, 88 baits were designed per mitochondrial genome, and we obtained a total of 286 baits. We synthesized the baits using a new technology for de novo DNA synthesis available from TWIST. We received an equimolar pool of the baits in a single tube.
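The number of baits per genome follows from the genome length and the bait length. A minimal sketch of that arithmetic, assuming non-overlapping 180 bp baits and dropping any trailing fragment; `baits_per_genome` is a hypothetical helper name.

```shell
# baits_per_genome GENOME_LENGTH BAIT_LENGTH
baits_per_genome() {
    # integer division: a trailing fragment shorter than one bait is not counted
    echo $(( $1 / $2 ))
}

baits_per_genome 16000 180   # prints 88
```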

Hybridization baits and sequencing

The baits will be received as DNA but will be rapidly transcribed into RNA (to facilitate hybridization), following the same step described in the hyRAD protocol. Hybridization will tolerate 5 to 10% divergence between the DNA sequence and the bait sequence, allowing a maximum of 9 SNPs per captured sequence. The specificity of the baits will be adjusted through the hybridization temperature, which means that two hybridization captures are usually performed, one at 55°C and the other at 65°C (for more specificity). No species closely related to Rana temporaria are present in the ponds we sampled.
Non-Rana temporaria sequences will be removed by computational post-processing, and MiSeq sequencing of our eDNA samples will be used to screen for other species present in them.
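The relation between tolerated divergence and the number of mismatches per bait is simple arithmetic. The sketch below assumes the figure refers to a 180 bp bait; `max_mismatches` is a hypothetical helper name.

```shell
# max_mismatches BAIT_LENGTH DIVERGENCE_PERCENT
max_mismatches() {
    # integer arithmetic, rounded down
    echo $(( $1 * $2 / 100 ))
}

max_mismatches 180 5    # prints 9
max_mismatches 180 10   # prints 18
```

Under this assumption, the 9 SNPs quoted above correspond to the 5% bound.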

Preliminary results from the MiSeq sequencing (Benjamin Penaud, MBB, 30/05/2023)

1- Count the number of reads in the raw data

Run Miseq 329

The following individuals are missing: 3ind11, 2ind14, 1ind18, 2ind28, 1ind27


seq_tot=0
n=0
for file in Data/Miseq329/fastq/*_R1_*.fastq.gz
do
sample=$(echo $file|awk -F "/" '{print $NF}'| sed -e 's/_.*.fastq.*//g' | sort | uniq)
line=$(zcat $file|wc -l)
seq=$(echo $line/4|bc)
n=$((n+1))
seq_tot=$((seq_tot+seq))
echo -e "|"$sample"|"$seq"|"
done
moy=$((seq_tot/n))
echo "There are $seq_tot raw sequences in total, i.e. $moy sequences per sample on average"
sample count
1ind15 42854
1ind17 67398
1ind28 44643
2ind11 123292
2ind12 89248
2ind13 104816
2ind18 103602
2ind19 90829
2ind27 141402
3ind10 218498
Undetermined 96730

There are 1123312 raw sequences in total, i.e. 102119 sequences per sample on average

Run Miseq 330


seq_tot=0
n=0
for file in Data/Miseq330/fastq/*_R1_*.fastq.gz
do
sample=$(echo $file|awk -F "/" '{print $NF}'| sed -e 's/_.*.fastq.*//g' | sort | uniq)
line=$(zcat $file|wc -l)
seq=$(echo $line/4|bc)
n=$((n+1))
seq_tot=$((seq_tot+seq))
echo -e "|"$sample"|"$seq"|"
done
moy=$((seq_tot/n))
echo "There are $seq_tot raw sequences in total, i.e. $moy sequences per sample on average"
sample count
5ind14 51049
6ind17 46219
7ind10 79654
7ind11 94112
7ind15 61165
7ind18 49619
7ind19 60693
7ind27 56987
7ind28 59125
8ind12 77371
8ind13 94649
Undetermined 91662

There are 822305 raw sequences in total, i.e. 68525 sequences per sample on average
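The counting loops in this section share one pattern: count lines, divide by four, accumulate a total and a mean. A minimal sketch of that pattern as reusable functions, assuming well-formed four-line FASTQ records; `count_reads` and `summarise_counts` are hypothetical names that do not appear in the original commands.

```shell
# count_reads FILE: number of FASTQ records (4 lines each), gzip-compressed or not
count_reads() {
    case "$1" in
        *.gz) gzip -dc "$1" ;;
        *)    cat "$1" ;;
    esac | awk 'END { printf "%d\n", NR / 4 }'
}

# summarise_counts FILE...: one line per file, then the total and the integer mean
summarise_counts() {
    total=0
    n=0
    for f in "$@"; do
        c=$(count_reads "$f")
        printf '%s\t%s\n' "$(basename "$f")" "$c"
        total=$((total + c))
        n=$((n + 1))
    done
    if [ "$n" -gt 0 ]; then
        printf 'total\t%s\nmean\t%s\n' "$total" "$((total / n))"
    fi
}

# Usage, with the paths used in this section:
# summarise_counts Data/Miseq329/fastq/*_R1_*.fastq.gz
```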

2- Reads quality

Miseq 329


mkdir -p Analysis/Miseq329/Quality/R1/ Analysis/Miseq329/Quality/R2/ Analysis/Miseq329/Report/
for file in Data/Miseq329/fastq/*_R1_*.fastq.gz
do
fastqc --outdir Analysis/Miseq329/Quality/R1/ $file
done

multiqc -m fastqc Analysis/Miseq329/Quality/R1/ -o Analysis/Miseq329/Report/ -n QC_329_R1_Report

for file in Data/Miseq329/fastq/*_R2_*.fastq.gz
do
fastqc --outdir Analysis/Miseq329/Quality/R2/ $file
done

multiqc -m fastqc Analysis/Miseq329/Quality/R2/ -o Analysis/Miseq329/Report/ -n QC_329_R2_Report

Miseq 330


mkdir -p Analysis/Miseq330/Quality/R1/ Analysis/Miseq330/Quality/R2/ Analysis/Miseq330/Report/
for file in Data/Miseq330/fastq/*_R1_*.fastq.gz
do
fastqc --outdir Analysis/Miseq330/Quality/R1/ $file
done

multiqc -m fastqc Analysis/Miseq330/Quality/R1/ -o Analysis/Miseq330/Report/ -n QC_330_R1_Report

for file in Data/Miseq330/fastq/*_R2_*.fastq.gz
do
fastqc --outdir Analysis/Miseq330/Quality/R2/ $file
done

multiqc -m fastqc Analysis/Miseq330/Quality/R2/ -o Analysis/Miseq330/Report/ -n QC_330_R2_Report

3- Demultiplexing


conda activate cutadapt

Miseq 329


mkdir Analysis/Miseq329/Demultiplex
awk '{print $3}' Data/Rana_barcodes_libs_329.csv |sort|uniq > Data/list_lib_329.txt

while read a
    do
    echo $a
    grep "$a" Data/Rana_barcodes_libs_329.csv |awk '{print ">"$1"\n"$2}' > Data/Rana_barcodes_libs_329.fasta
    cutadapt -e 0.17 --minimum-length 30 -q 10 --no-indels -g file:Data/Rana_barcodes_libs_329.fasta -o Analysis/Miseq329/Demultiplex/demux-{name}_R1.fastq.gz -p Analysis/Miseq329/Demultiplex/demux-{name}_R2.fastq.gz Data/Miseq329/fastq/"$a"_*_R1_001.fastq.gz Data/Miseq329/fastq/"$a"_*_R2_001.fastq.gz
    done < Data/list_lib_329.txt
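The table-to-FASTA conversion inside the loop above can be isolated as a small function. This is a sketch, assuming the barcode table has whitespace-separated columns (sample name, barcode sequence, library name), as the awk calls imply; `barcodes_to_fasta` is a hypothetical helper name. It selects rows by exact match on the third column, whereas grep would also match a library name that is a substring of another.

```shell
# barcodes_to_fasta TABLE LIBRARY
# Print the barcodes of one library as FASTA records (>sample / barcode).
barcodes_to_fasta() {
    awk -v lib="$2" '$3 == lib { print ">" $1 "\n" $2 }' "$1"
}

# Usage, with the file names used in this section:
# barcodes_to_fasta Data/Rana_barcodes_libs_329.csv "$a" > Data/Rana_barcodes_libs_329.fasta
```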



seq_tot=0
n=0
for file in Analysis/Miseq329/Demultiplex/*_R1.fastq.gz
do
sample=$(echo $file|awk -F "/" '{print $NF}'| sed -e 's/_R1.fastq.*//g' | sort | uniq)
line=$(zcat $file|wc -l)
seq=$(echo $line/4|bc)
n=$((n+1))
seq_tot=$((seq_tot+seq))
echo -e "|"$sample"|"$seq"|"
done
moy=$((seq_tot/n))
echo "There are $seq_tot sequences after demultiplexing, i.e. $moy sequences per sample on average"

sample count
demux-SPY_201_291 91548
demux-SPY_201_293 20675
demux-SPY_201_294 41615
demux-SPY_201_296 2417
demux-SPY_201_297 65611
demux-SPY_211_195 39569
demux-SPY_211_196 29336
demux-SPY_211_200 740
demux-SPY_211_201 65722
demux-SPY_211_203 61151
demux-SPY_211_210 59827
demux-SPY_211_212 59240
demux-SPY_211_213 64286
demux-unknown 921

There are 602658 sequences after demultiplexing, i.e. 43047 sequences per sample on average. After this step, 53.65% of the 1123312 raw reads were assigned to a sample.

Miseq 330


mkdir Analysis/Miseq330/Demultiplex
awk '{print $3}' Data/Rana_barcodes_libs_330.csv |sort|uniq > Data/list_lib_330.txt

while read a
    do
    echo $a
    grep "$a" Data/Rana_barcodes_libs_330.csv |awk '{print ">"$1"\n"$2}' > Data/Rana_barcodes_libs_330.fasta
    cutadapt -e 0.17 --minimum-length 30 -q 10 --no-indels -g file:Data/Rana_barcodes_libs_330.fasta -o Analysis/Miseq330/Demultiplex/demux-{name}_R1.fastq.gz -p Analysis/Miseq330/Demultiplex/demux-{name}_R2.fastq.gz Data/Miseq330/fastq/"$a"_*_R1_001.fastq.gz Data/Miseq330/fastq/"$a"_*_R2_001.fastq.gz
    done < Data/list_lib_330.txt



seq_tot=0
n=0
for file in Analysis/Miseq330/Demultiplex/*_R1.fastq.gz
do
sample=$(echo $file|awk -F "/" '{print $NF}'| sed -e 's/_R1.fastq.*//g' | sort | uniq)
line=$(zcat $file|wc -l)
seq=$(echo $line/4|bc)
n=$((n+1))
seq_tot=$((seq_tot+seq))
echo -e "|"$sample"|"$seq"|"
done
moy=$((seq_tot/n))
echo "There are $seq_tot sequences after demultiplexing, i.e. $moy sequences per sample on average"

sample count
demux-RTE-01 9892
demux-RTE-02 19171
demux-RTE-03 7848
demux-RTE-04 10971
demux-RTE-05 5150
demux-RTE-06 7031
demux-RTE-07 4751
demux-RTE-08 9706
demux-RTE-09 6480
demux-RTE-10 10562
demux-RTE-11 6025
demux-RTE-12 9854
demux-RTE-13 7405
demux-RTE-15 7474
demux-RTE-16 7267
demux-RTE-17 2977
demux-RTE-18 9351
demux-RTE-19 7612
demux-RTE-20 8404
demux-RTE-21 7273
demux-RTE-22 4819
demux-RTE-23 6676
demux-RTE-24 4155
demux-RTE-25 8691
demux-RTE-26 8258
demux-RTE-27 11353
demux-RTE-28 4058
demux-RTE-29 5369
demux-RTE-30 5697
demux-RTE-31 4698
demux-RTE-32 4095
demux-RTE-33 7915
demux-RTE-34 8185
demux-RTE-35 13238
demux-RTE-36 9931
demux-RTE-37 5676
demux-RTE-38 4741
demux-RTE-39 4131
demux-RTE-40 12296
demux-RTE-41 8185
demux-RTE-42 5291
demux-RTE-43 12023
demux-RTE-44 14499
demux-RTE-45 14828
demux-RTE-46 10581
demux-RTE-47 10410
demux-RTE-48 10402
demux-RTE-49 6758
demux-RTE-50 13630
demux-RTE-51 9575
demux-RTE-52 10550
demux-RTE-53 12477
demux-RTE-54 10075
demux-RTE-55 14647
demux-RTE-56 5217
demux-RTE-57 17325
demux-RTE-59 9271
demux-RTE-61 7301
demux-RTE-62 11698
demux-RTE-63 5459
demux-RTE-64 6300
demux-RTE-65 13997
demux-RTE-66 28768
demux-RTE-67 10446
demux-RTE-68 12291
demux-RTE-69 13223
demux-RTE-70 8092
demux-RTE-71 10300
demux-RTE-72 9385
demux-RTE-73 6688
demux-RTE-74 8474
demux-RTE-75 14345
demux-RTE-76 19513
demux-RTE-77 15146
demux-RTE-78 10533
demux-RTE-79 8645
demux-unknown 53

There are 715587 sequences after demultiplexing, i.e. 9293 sequences per sample on average.
87.02% of reads were assigned to a sample.
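The retained fraction is the number of demultiplexed reads divided by the number of raw reads of the same run. A minimal sketch of that calculation, using the Miseq 330 totals reported in this document; `pct` is a hypothetical helper name.

```shell
# pct PART WHOLE: percentage with two decimals
pct() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", 100 * a / b }'
}

pct 715587 822305   # Miseq 330: demultiplexed reads / raw reads, prints 87.02
```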

4- Read trimming

Miseq 329


conda activate cutadapt

mkdir Analysis/Miseq329/Trimming
for i in Analysis/Miseq329/Demultiplex/*_R1.fastq.gz
do
sample=$(echo $i|awk -F "/" '{print $NF}' |sed 's/.fastq.gz/.fastq/')
cutadapt -q 10 -m 30 -a AGATCGGAAGAGC -o Analysis/Miseq329/Trimming/clean_"$sample" "$i"
done  

for i in Analysis/Miseq329/Demultiplex/*_R2.fastq.gz
do
sample=$(echo $i|awk -F "/" '{print $NF}' |sed 's/.fastq.gz/.fastq/')
cutadapt -u 5 -q 10 -m 30 -a AGATCGGAAGAGC -o Analysis/Miseq329/Trimming/clean_"$sample" "$i"
done  


conda activate fastqpair

for i in Analysis/Miseq329/Trimming/clean_demux-*_R1.fastq
do
sample=$(echo $i | sed -e 's/_R1.fastq//g')
fastq_pair "$sample"_R1.fastq "$sample"_R2.fastq
done


seq_tot=0
n=0
for file in Analysis/Miseq329/Trimming/*_R1.fastq.paired.fq
do
sample=$(echo $file|awk -F "/" '{print $NF}'| sed -e 's/_R1.fastq.*//g' | sort | uniq)
line=$(cat $file|wc -l)
seq=$(echo $line/4|bc)
n=$((n+1))
seq_tot=$((seq_tot+seq))
echo -e "|"$sample"|"$seq"|"
done
moy=$((seq_tot/n))
echo "There are $seq_tot sequences after trimming, i.e. $moy sequences per sample on average"
sample count
clean_demux-SPY_201_291 91345
clean_demux-SPY_201_293 20653
clean_demux-SPY_201_294 41608
clean_demux-SPY_201_296 2412
clean_demux-SPY_201_297 65589
clean_demux-SPY_211_195 39543
clean_demux-SPY_211_196 29322
clean_demux-SPY_211_200 740
clean_demux-SPY_211_201 65698
clean_demux-SPY_211_203 61123
clean_demux-SPY_211_210 59812
clean_demux-SPY_211_212 59220
clean_demux-SPY_211_213 64227
clean_demux-unknown 921

There are 602213 sequences after trimming, i.e. 43015 sequences per sample on average

Miseq 330


conda activate cutadapt

mkdir Analysis/Miseq330/Trimming
for i in Analysis/Miseq330/Demultiplex/*_R1.fastq.gz
do
sample=$(echo $i|awk -F "/" '{print $NF}' |sed 's/.fastq.gz/.fastq/')
cutadapt -q 10 -m 30 -a AGATCGGAAGAGC -o Analysis/Miseq330/Trimming/clean_"$sample" "$i"
done  

for i in Analysis/Miseq330/Demultiplex/*_R2.fastq.gz
do
sample=$(echo $i|awk -F "/" '{print $NF}' |sed 's/.fastq.gz/.fastq/')
cutadapt -u 5 -q 10 -m 30 -a AGATCGGAAGAGC -o Analysis/Miseq330/Trimming/clean_"$sample" "$i"
done  


conda activate fastqpair

for i in Analysis/Miseq330/Trimming/clean_demux-*_R1.fastq
do
sample=$(echo $i | sed -e 's/_R1.fastq//g')
fastq_pair "$sample"_R1.fastq "$sample"_R2.fastq
done


seq_tot=0
n=0
for file in Analysis/Miseq330/Trimming/*_R1.fastq.paired.fq
do
sample=$(echo $file|awk -F "/" '{print $NF}'| sed -e 's/_R1.fastq.*//g' | sort | uniq)
line=$(cat $file|wc -l)
seq=$(echo $line/4|bc)
n=$((n+1))
seq_tot=$((seq_tot+seq))
echo -e "|"$sample"|"$seq"|"
done
moy=$((seq_tot/n))
echo "There are $seq_tot sequences after trimming, i.e. $moy sequences per sample on average"
sample count
clean_demux-RTE-01 9886
clean_demux-RTE-02 19169
clean_demux-RTE-03 7844
clean_demux-RTE-04 10960
clean_demux-RTE-05 5149
clean_demux-RTE-06 7024
clean_demux-RTE-07 4751
clean_demux-RTE-08 9700
clean_demux-RTE-09 6477
clean_demux-RTE-10 10553
clean_demux-RTE-11 6016
clean_demux-RTE-12 9848
clean_demux-RTE-13 7401
clean_demux-RTE-15 7469
clean_demux-RTE-16 7265
clean_demux-RTE-17 2974
clean_demux-RTE-18 9344
clean_demux-RTE-19 7608
clean_demux-RTE-20 8397
clean_demux-RTE-21 7270
clean_demux-RTE-22 4816
clean_demux-RTE-23 6675
clean_demux-RTE-24 4153
clean_demux-RTE-25 8690
clean_demux-RTE-26 8256
clean_demux-RTE-27 11351
clean_demux-RTE-28 4054
clean_demux-RTE-29 5365
clean_demux-RTE-30 5684
clean_demux-RTE-31 4694
clean_demux-RTE-32 4092
clean_demux-RTE-33 7913
clean_demux-RTE-34 8183
clean_demux-RTE-35 13237
clean_demux-RTE-36 9929
clean_demux-RTE-37 5675
clean_demux-RTE-38 4740
clean_demux-RTE-39 4127
clean_demux-RTE-40 12294
clean_demux-RTE-41 8184
clean_demux-RTE-42 5288
clean_demux-RTE-43 12019
clean_demux-RTE-44 14497
clean_demux-RTE-45 14824
clean_demux-RTE-46 10579
clean_demux-RTE-47 10408
clean_demux-RTE-48 10402
clean_demux-RTE-49 6758
clean_demux-RTE-50 13620
clean_demux-RTE-51 9573
clean_demux-RTE-52 10545
clean_demux-RTE-53 12475
clean_demux-RTE-54 10075
clean_demux-RTE-55 14647
clean_demux-RTE-56 5216
clean_demux-RTE-57 17324
clean_demux-RTE-59 9270
clean_demux-RTE-61 7300
clean_demux-RTE-62 11698
clean_demux-RTE-63 5458
clean_demux-RTE-64 6298
clean_demux-RTE-65 13996
clean_demux-RTE-66 28764
clean_demux-RTE-67 10445
clean_demux-RTE-68 12288
clean_demux-RTE-69 13223
clean_demux-RTE-70 8091
clean_demux-RTE-71 10297
clean_demux-RTE-72 9383
clean_demux-RTE-73 6686
clean_demux-RTE-74 8474
clean_demux-RTE-75 14345
clean_demux-RTE-76 19512
clean_demux-RTE-77 15144
clean_demux-RTE-78 10533
clean_demux-RTE-79 8644
clean_demux-unknown 53

There are 715369 sequences after trimming, i.e. 9290 sequences per sample on average
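Because R1 and R2 are trimmed separately and then re-paired, it can be worth confirming that both paired files of a sample hold the same number of records before mapping. A minimal sketch of such a check, assuming uncompressed four-line FASTQ files; `pairs_match` is a hypothetical helper name and this check is not part of the original workflow.

```shell
# pairs_match R1_FILE R2_FILE: exit status 0 when both files hold the same number of records
pairs_match() {
    r1=$(awk 'END { printf "%d", NR / 4 }' "$1")
    r2=$(awk 'END { printf "%d", NR / 4 }' "$2")
    [ "$r1" -eq "$r2" ]
}

# Usage, with the file naming used in this section:
# for r1 in Analysis/Miseq330/Trimming/*_R1.fastq.paired.fq; do
#     r2=$(echo "$r1" | sed 's/_R1\.fastq/_R2.fastq/')
#     pairs_match "$r1" "$r2" || echo "pair count mismatch: $r1"
# done
```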

5- Mapping

Miseq 329


mkdir Analysis/Miseq329/Mapping/
conda activate mapping

Genome LR991693


bwa index Data/Reference/LR991693_Rana_temporia_mt.fasta

samtools faidx Data/Reference/LR991693_Rana_temporia_mt.fasta

samtools dict Data/Reference/LR991693_Rana_temporia_mt.fasta > Data/Reference/LR991693_Rana_temporia_mt.fasta.dict


mkdir Analysis/Miseq329/Mapping/LR991693


for i in Analysis/Miseq329/Trimming/*_R1.fastq.paired.fq
do
file=$(echo $i |sed 's/_R1.fastq.paired.fq//')
sample=$(echo $i|awk -F "/" '{print $NF}' |sed 's/_R1.fastq.paired.fq//')
bwa mem -t 20 -R "@RG\tID:${sample}\tSM:${sample}\tPL:ILLUMINA" Data/Reference/LR991693_Rana_temporia_mt.fasta "$i" "${file}_R2.fastq.paired.fq" | samtools sort -@ 10 -o "Analysis/Miseq329/Mapping/LR991693/mapped_${sample}.bam" -
done  


for i in Analysis/Miseq329/Mapping/LR991693/*.bam
do
sample=$(echo $i|awk -F "/" '{print $NF}' |sed 's/.bam//')
samtools flagstat $i | awk -F " " -v var1="$sample" '{if($0~/paired in sequencing/){total=$1};if($0~/with itself and mate mapped/){paired=$1};if($0~/singletons/){sing=$1}}END{sum=sing+paired;perc=sum/total*100;print "|"var1"|"sum"|"perc"|"}'
done  
sample mapped percent
mapped_clean_demux-SPY_201_291 51373 28.1203
mapped_clean_demux-SPY_201_293 7576 18.3412
mapped_clean_demux-SPY_201_294 54886 65.9561
mapped_clean_demux-SPY_201_296 2215 45.9163
mapped_clean_demux-SPY_201_297 116497 88.8083
mapped_clean_demux-SPY_211_195 30008 37.9435
mapped_clean_demux-SPY_211_196 31460 53.6457
mapped_clean_demux-SPY_211_200 639 43.1757
mapped_clean_demux-SPY_211_201 119755 91.1405
mapped_clean_demux-SPY_211_203 58653 47.9795
mapped_clean_demux-SPY_211_210 18030 15.0722
mapped_clean_demux-SPY_211_212 104645 88.3528
mapped_clean_demux-SPY_211_213 120267 93.6265
mapped_clean_demux-unknown 181 9.82628
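The awk filter used on the flagstat output adds the reads mapped together with their mate to the singletons, then divides by the number of reads paired in sequencing. A sketch of the same logic as a function reading flagstat text from standard input; `mapped_pct` is a hypothetical name, and the rounding to two decimals is a choice made here. Since flagstat counts individual reads rather than pairs, this would explain why the mapped counts can exceed the pair counts reported after trimming.

```shell
# mapped_pct: read `samtools flagstat` output on stdin,
# print "<mapped reads><TAB><percent of reads paired in sequencing>"
mapped_pct() {
    awk '
        /paired in sequencing/        { total  = $1 }
        /with itself and mate mapped/ { paired = $1 }
        /singletons/                  { single = $1 }
        END {
            if (total > 0) {
                printf "%d\t%.2f\n", paired + single, 100 * (paired + single) / total
            }
        }'
}

# Usage, with a path from this section:
# samtools flagstat Analysis/Miseq329/Mapping/LR991693/mapped_clean_demux-SPY_201_291.bam | mapped_pct
```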

Genome MT483700


bwa index Data/Reference/MT483700_Rana_temporia_mt.fasta

samtools faidx Data/Reference/MT483700_Rana_temporia_mt.fasta

samtools dict Data/Reference/MT483700_Rana_temporia_mt.fasta > Data/Reference/MT483700_Rana_temporia_mt.fasta.dict


mkdir Analysis/Miseq329/Mapping/MT483700


for i in Analysis/Miseq329/Trimming/*_R1.fastq.paired.fq
do
file=$(echo $i |sed 's/_R1.fastq.paired.fq//')
sample=$(echo $i|awk -F "/" '{print $NF}' |sed 's/_R1.fastq.paired.fq//')
bwa mem -t 20 -R "@RG\tID:${sample}\tSM:${sample}\tPL:ILLUMINA" Data/Reference/MT483700_Rana_temporia_mt.fasta "$i" "${file}_R2.fastq.paired.fq" | samtools sort -@ 10 -o "Analysis/Miseq329/Mapping/MT483700/mapped_${sample}.bam" -
done  


for i in Analysis/Miseq329/Mapping/MT483700/*.bam
do
sample=$(echo $i|awk -F "/" '{print $NF}' |sed 's/.bam//')
samtools flagstat $i | awk -F " " -v var1="$sample" '{if($0~/paired in sequencing/){total=$1};if($0~/with itself and mate mapped/){paired=$1};if($0~/singletons/){sing=$1}}END{sum=sing+paired;perc=sum/total*100;print "|"var1"|"sum"|"perc"|"}'
done  
sample mapped percent
mapped_clean_demux-SPY_201_291 50789 27.8006
mapped_clean_demux-SPY_201_293 7283 17.6318
mapped_clean_demux-SPY_201_294 53872 64.7376
mapped_clean_demux-SPY_201_296 2171 45.0041
mapped_clean_demux-SPY_201_297 115375 87.953
mapped_clean_demux-SPY_211_195 29595 37.4213
mapped_clean_demux-SPY_211_196 31230 53.2535
mapped_clean_demux-SPY_211_200 628 42.4324
mapped_clean_demux-SPY_211_201 117342 89.3041
mapped_clean_demux-SPY_211_203 57547 47.0748
mapped_clean_demux-SPY_211_210 17747 14.8357
mapped_clean_demux-SPY_211_212 103454 87.3472
mapped_clean_demux-SPY_211_213 118088 91.9302
mapped_clean_demux-unknown 181 9.82628

Genome NC_042226


bwa index Data/Reference/NC_042226_Rana_temporia_mt.fasta

samtools faidx Data/Reference/NC_042226_Rana_temporia_mt.fasta

samtools dict Data/Reference/NC_042226_Rana_temporia_mt.fasta > Data/Reference/NC_042226_Rana_temporia_mt.fasta.dict


mkdir Analysis/Miseq329/Mapping/NC_042226


for i in Analysis/Miseq329/Trimming/*_R1.fastq.paired.fq
do
file=$(echo $i |sed 's/_R1.fastq.paired.fq//')
sample=$(echo $i|awk -F "/" '{print $NF}' |sed 's/_R1.fastq.paired.fq//')
bwa mem -t 20 -R "@RG\tID:${sample}\tSM:${sample}\tPL:ILLUMINA" Data/Reference/NC_042226_Rana_temporia_mt.fasta "$i" "${file}_R2.fastq.paired.fq" | samtools sort -@ 10 -o "Analysis/Miseq329/Mapping/NC_042226/mapped_${sample}.bam" -
done  


for i in Analysis/Miseq329/Mapping/NC_042226/*.bam
do
sample=$(echo $i|awk -F "/" '{print $NF}' |sed 's/.bam//')
samtools flagstat $i | awk -F " " -v var1="$sample" '{if($0~/paired in sequencing/){total=$1};if($0~/with itself and mate mapped/){paired=$1};if($0~/singletons/){sing=$1}}END{sum=sing+paired;perc=sum/total*100;print "|"var1"|"sum"|"perc"|"}'
done  
sample mapped percent
mapped_clean_demux-SPY_201_291 43165 23.6275
mapped_clean_demux-SPY_201_293 6892 16.6852
mapped_clean_demux-SPY_201_294 45391 54.546
mapped_clean_demux-SPY_201_296 1915 39.6973
mapped_clean_demux-SPY_201_297 100132 76.3329
mapped_clean_demux-SPY_211_195 25873 32.715
mapped_clean_demux-SPY_211_196 23140 39.4584
mapped_clean_demux-SPY_211_200 414 27.973
mapped_clean_demux-SPY_211_201 109364 83.2324
mapped_clean_demux-SPY_211_203 46074 37.6896
mapped_clean_demux-SPY_211_210 14876 12.4356
mapped_clean_demux-SPY_211_212 78577 66.3433
mapped_clean_demux-SPY_211_213 102507 79.8006
mapped_clean_demux-unknown 165 8.95765

Miseq 330


mkdir Analysis/Miseq330/Mapping/
conda activate mapping

Genome LR991693


mkdir Analysis/Miseq330/Mapping/LR991693


for i in Analysis/Miseq330/Trimming/*_R1.fastq.paired.fq
do
file=$(echo $i |sed 's/_R1.fastq.paired.fq//')
sample=$(echo $i|awk -F "/" '{print $NF}' |sed 's/_R1.fastq.paired.fq//')
bwa mem -t 20 -R "@RG\tID:${sample}\tSM:${sample}\tPL:ILLUMINA" Data/Reference/LR991693_Rana_temporia_mt.fasta "$i" "${file}_R2.fastq.paired.fq" | samtools sort -@ 10 -o "Analysis/Miseq330/Mapping/LR991693/mapped_${sample}.bam" -
done  


for i in Analysis/Miseq330/Mapping/LR991693/*.bam
do
sample=$(echo $i|awk -F "/" '{print $NF}' |sed 's/.bam//')
samtools flagstat $i | awk -F " " -v var1="$sample" '{if($0~/paired in sequencing/){total=$1};if($0~/with itself and mate mapped/){paired=$1};if($0~/singletons/){sing=$1}}END{sum=sing+paired;perc=sum/total*100;print "|"var1"|"sum"|"perc"|"}'
done  
sample mapped percent
mapped_clean_demux-RTE-01 17084 86.405
mapped_clean_demux-RTE-02 35460 92.4931
mapped_clean_demux-RTE-03 13242 84.4085
mapped_clean_demux-RTE-04 19916 90.8577
mapped_clean_demux-RTE-05 9328 90.5807
mapped_clean_demux-RTE-06 12375 88.0908
mapped_clean_demux-RTE-07 8216 86.466
mapped_clean_demux-RTE-08 16127 83.1289
mapped_clean_demux-RTE-09 11000 84.9159
mapped_clean_demux-RTE-10 17173 81.3655
mapped_clean_demux-RTE-11 8997 74.7756
mapped_clean_demux-RTE-12 17424 88.4647
mapped_clean_demux-RTE-13 12921 87.2923
mapped_clean_demux-RTE-15 13138 87.9502
mapped_clean_demux-RTE-16 13403 92.2436
mapped_clean_demux-RTE-17 4757 79.9765
mapped_clean_demux-RTE-18 16950 90.6999
mapped_clean_demux-RTE-19 12507 82.1964
mapped_clean_demux-RTE-20 14952 89.0318
mapped_clean_demux-RTE-21 13203 90.8047
mapped_clean_demux-RTE-22 8366 86.8563
mapped_clean_demux-RTE-23 12248 91.7453
mapped_clean_demux-RTE-24 7480 90.0554
mapped_clean_demux-RTE-25 15697 90.3165
mapped_clean_demux-RTE-26 15547 94.1558
mapped_clean_demux-RTE-27 21745 95.7845
mapped_clean_demux-RTE-28 7580 93.4879
mapped_clean_demux-RTE-29 9840 91.7055
mapped_clean_demux-RTE-30 10440 91.8367
mapped_clean_demux-RTE-31 8638 92.0111
mapped_clean_demux-RTE-32 6303 77.0161
mapped_clean_demux-RTE-33 13921 87.9628
mapped_clean_demux-RTE-34 14537 88.8244
mapped_clean_demux-RTE-35 23929 90.3868
mapped_clean_demux-RTE-36 17423 87.7379
mapped_clean_demux-RTE-37 10081 88.8194
mapped_clean_demux-RTE-38 8525 89.9262
mapped_clean_demux-RTE-39 7378 89.387
mapped_clean_demux-RTE-40 22975 93.4399
mapped_clean_demux-RTE-41 14466 88.3798
mapped_clean_demux-RTE-42 9333 88.247
mapped_clean_demux-RTE-43 22673 94.3215
mapped_clean_demux-RTE-44 27047 93.2848
mapped_clean_demux-RTE-45 28066 94.6641
mapped_clean_demux-RTE-46 19836 93.7518
mapped_clean_demux-RTE-47 19542 93.8797
mapped_clean_demux-RTE-48 19850 95.4143
mapped_clean_demux-RTE-49 12943 95.7606
mapped_clean_demux-RTE-50 26191 96.149
mapped_clean_demux-RTE-51 17936 93.6801
mapped_clean_demux-RTE-52 19884 94.2817
mapped_clean_demux-RTE-53 23454 94.004
mapped_clean_demux-RTE-54 18881 93.7022
mapped_clean_demux-RTE-55 27645 94.3709
mapped_clean_demux-RTE-56 9797 93.913
mapped_clean_demux-RTE-57 33278 96.0459
mapped_clean_demux-RTE-59 17791 95.9601
mapped_clean_demux-RTE-61 13903 95.226
mapped_clean_demux-RTE-62 22241 95.0633
mapped_clean_demux-RTE-63 10207 93.5049
mapped_clean_demux-RTE-64 11642 92.4262
mapped_clean_demux-RTE-65 26849 95.9167
mapped_clean_demux-RTE-66 55462 96.4087
mapped_clean_demux-RTE-67 20022 95.8449
mapped_clean_demux-RTE-68 23057 93.8192
mapped_clean_demux-RTE-69 25242 95.4473
mapped_clean_demux-RTE-70 15450 95.4765
mapped_clean_demux-RTE-71 19945 96.8486
mapped_clean_demux-RTE-72 18003 95.9341
mapped_clean_demux-RTE-73 12640 94.5259
mapped_clean_demux-RTE-74 16142 95.2443
mapped_clean_demux-RTE-75 26758 93.2659
mapped_clean_demux-RTE-76 37443 95.9486
mapped_clean_demux-RTE-77 28440 93.8986
mapped_clean_demux-RTE-78 20022 95.0441
mapped_clean_demux-RTE-79 16340 94.5164
mapped_clean_demux-unknown 78 73.5849

Genome MT483700


mkdir Analysis/Miseq330/Mapping/MT483700


for i in Analysis/Miseq330/Trimming/*_R1.fastq.paired.fq
do
file=$(echo $i |sed 's/_R1.fastq.paired.fq//')
sample=$(echo $i|awk -F "/" '{print $NF}' |sed 's/_R1.fastq.paired.fq//')
bwa mem -t 20 -R "@RG\tID:${sample}\tSM:${sample}\tPL:ILLUMINA" Data/Reference/MT483700_Rana_temporia_mt.fasta "$i" "${file}_R2.fastq.paired.fq" | samtools sort -@ 10 -o "Analysis/Miseq330/Mapping/MT483700/mapped_${sample}.bam" -
done  


for i in Analysis/Miseq330/Mapping/MT483700/*.bam
do
sample=$(echo $i|awk -F "/" '{print $NF}' |sed 's/.bam//')
samtools flagstat $i | awk -F " " -v var1="$sample" '{if($0~/paired in sequencing/){total=$1};if($0~/with itself and mate mapped/){paired=$1};if($0~/singletons/){sing=$1}}END{sum=sing+paired;perc=sum/total*100;print "|"var1"|"sum"|"perc"|"}'
done  
sample mapped percent
mapped_clean_demux-RTE-01 16853 85.2367
mapped_clean_demux-RTE-02 34952 91.168
mapped_clean_demux-RTE-03 13078 83.3631
mapped_clean_demux-RTE-04 19666 89.7172
mapped_clean_demux-RTE-05 9240 89.7262
mapped_clean_demux-RTE-06 12201 86.8522
mapped_clean_demux-RTE-07 8104 85.2873
mapped_clean_demux-RTE-08 15948 82.2062
mapped_clean_demux-RTE-09 10822 83.5418
mapped_clean_demux-RTE-10 16947 80.2947
mapped_clean_demux-RTE-11 8882 73.8198
mapped_clean_demux-RTE-12 17155 87.0989
mapped_clean_demux-RTE-13 12734 86.0289
mapped_clean_demux-RTE-15 12925 86.5243
mapped_clean_demux-RTE-16 13220 90.9842
mapped_clean_demux-RTE-17 4693 78.9005
mapped_clean_demux-RTE-18 16706 89.3943
mapped_clean_demux-RTE-19 12366 81.2697
mapped_clean_demux-RTE-20 14675 87.3824
mapped_clean_demux-RTE-21 13015 89.5117
mapped_clean_demux-RTE-22 8218 85.3198
mapped_clean_demux-RTE-23 12059 90.3296
mapped_clean_demux-RTE-24 7328 88.2254
mapped_clean_demux-RTE-25 15479 89.0621
mapped_clean_demux-RTE-26 15327 92.8234
mapped_clean_demux-RTE-27 21478 94.6084
mapped_clean_demux-RTE-28 7454 91.9339
mapped_clean_demux-RTE-29 9715 90.5405
mapped_clean_demux-RTE-30 10290 90.5172
mapped_clean_demux-RTE-31 8489 90.4239
mapped_clean_demux-RTE-32 6216 75.9531
mapped_clean_demux-RTE-33 13707 86.6106
mapped_clean_demux-RTE-34 14344 87.6451
mapped_clean_demux-RTE-35 23532 88.8872
mapped_clean_demux-RTE-36 17125 86.2373
mapped_clean_demux-RTE-37 9931 87.4978
mapped_clean_demux-RTE-38 8395 88.5549
mapped_clean_demux-RTE-39 7250 87.8362
mapped_clean_demux-RTE-40 22651 92.1222
mapped_clean_demux-RTE-41 14216 86.8524
mapped_clean_demux-RTE-42 9213 87.1123
mapped_clean_demux-RTE-43 22344 92.9528
mapped_clean_demux-RTE-44 26703 92.0984
mapped_clean_demux-RTE-45 27719 93.4937
mapped_clean_demux-RTE-46 19548 92.3906
mapped_clean_demux-RTE-47 19239 92.4241
mapped_clean_demux-RTE-48 19546 93.9531
mapped_clean_demux-RTE-49 12775 94.5176
mapped_clean_demux-RTE-50 25561 93.8363
mapped_clean_demux-RTE-51 17678 92.3326
mapped_clean_demux-RTE-52 19617 93.0156
mapped_clean_demux-RTE-53 22998 92.1764
mapped_clean_demux-RTE-54 18581 92.2134
mapped_clean_demux-RTE-55 27212 92.8927
mapped_clean_demux-RTE-56 9599 92.015
mapped_clean_demux-RTE-57 32853 94.8193
mapped_clean_demux-RTE-59 17544 94.6278
mapped_clean_demux-RTE-61 13733 94.0616
mapped_clean_demux-RTE-62 21930 93.734
mapped_clean_demux-RTE-63 10068 92.2316
mapped_clean_demux-RTE-64 11404 90.5367
mapped_clean_demux-RTE-65 26505 94.6878
mapped_clean_demux-RTE-66 54714 95.1085
mapped_clean_demux-RTE-67 19748 94.5333
mapped_clean_demux-RTE-68 22727 92.4764
mapped_clean_demux-RTE-69 24878 94.0709
mapped_clean_demux-RTE-70 15238 94.1664
mapped_clean_demux-RTE-71 19705 95.6832
mapped_clean_demux-RTE-72 17772 94.7032
mapped_clean_demux-RTE-73 12441 93.0377
mapped_clean_demux-RTE-74 15906 93.8518
mapped_clean_demux-RTE-75 26275 91.5824
mapped_clean_demux-RTE-76 36937 94.652
mapped_clean_demux-RTE-77 27941 92.2511
mapped_clean_demux-RTE-78 19678 93.4112
mapped_clean_demux-RTE-79 16062 92.9084
mapped_clean_demux-unknown 76 71.6981

Genome NC_042226


mkdir Analysis/Miseq330/Mapping/NC_042226


for i in Analysis/Miseq330/Trimming/*_R1.fastq.paired.fq
do
file=$(echo $i |sed 's/_R1.fastq.paired.fq//')
sample=$(echo $i|awk -F "/" '{print $NF}' |sed 's/_R1.fastq.paired.fq//')
bwa mem -t 20 -R "@RG\tID:${sample}\tSM:${sample}\tPL:ILLUMINA" Data/Reference/NC_042226_Rana_temporia_mt.fasta "$i" "${file}_R2.fastq.paired.fq" | samtools sort -@ 10 -o "Analysis/Miseq330/Mapping/NC_042226/mapped_${sample}.bam" -
done  


for i in Analysis/Miseq330/Mapping/NC_042226/*.bam
do
sample=$(echo $i|awk -F "/" '{print $NF}' |sed 's/.bam//')
samtools flagstat $i | awk -F " " -v var1="$sample" '{if($0~/paired in sequencing/){total=$1};if($0~/with itself and mate mapped/){paired=$1};if($0~/singletons/){sing=$1}}END{sum=sing+paired;perc=sum/total*100;print "|"var1"|"sum"|"perc"|"}'
done  
sample mapped percent
mapped_clean_demux-RTE-01 14589 73.7862
mapped_clean_demux-RTE-02 31045 80.9771
mapped_clean_demux-RTE-03 11259 71.7682
mapped_clean_demux-RTE-04 17217 78.5447
mapped_clean_demux-RTE-05 8142 79.0639
mapped_clean_demux-RTE-06 10637 75.719
mapped_clean_demux-RTE-07 7101 74.7316
mapped_clean_demux-RTE-08 14183 73.1082
mapped_clean_demux-RTE-09 9686 74.7723
mapped_clean_demux-RTE-10 15208 72.0553
mapped_clean_demux-RTE-11 7966 66.2068
mapped_clean_demux-RTE-12 15545 78.9247
mapped_clean_demux-RTE-13 11468 77.476
mapped_clean_demux-RTE-15 11606 77.6945
mapped_clean_demux-RTE-16 12076 83.1108
mapped_clean_demux-RTE-17 4366 73.4028
mapped_clean_demux-RTE-18 14947 79.9818
mapped_clean_demux-RTE-19 11118 73.0678
mapped_clean_demux-RTE-20 13322 79.3259
mapped_clean_demux-RTE-21 11564 79.5323
mapped_clean_demux-RTE-22 7331 76.1109
mapped_clean_demux-RTE-23 10650 79.7753
mapped_clean_demux-RTE-24 6497 78.2206
mapped_clean_demux-RTE-25 13664 78.6191
mapped_clean_demux-RTE-26 13620 82.4855
mapped_clean_demux-RTE-27 19368 85.3141
mapped_clean_demux-RTE-28 6644 81.9438
mapped_clean_demux-RTE-29 8630 80.4287
mapped_clean_demux-RTE-30 9248 81.3512
mapped_clean_demux-RTE-31 7516 80.0597
mapped_clean_demux-RTE-32 5825 71.1755
mapped_clean_demux-RTE-33 12277 77.5749
mapped_clean_demux-RTE-34 13355 81.6021
mapped_clean_demux-RTE-35 21598 81.5819
mapped_clean_demux-RTE-36 15552 78.316
mapped_clean_demux-RTE-37 8877 78.2115
mapped_clean_demux-RTE-38 7805 82.3312
mapped_clean_demux-RTE-39 6516 78.9435
mapped_clean_demux-RTE-40 20141 81.9139
mapped_clean_demux-RTE-41 12388 75.6843
mapped_clean_demux-RTE-42 8232 77.8366
mapped_clean_demux-RTE-43 20024 83.3014
mapped_clean_demux-RTE-44 23843 82.2343
mapped_clean_demux-RTE-45 24361 82.1674
mapped_clean_demux-RTE-46 17282 81.6807
mapped_clean_demux-RTE-47 17134 82.3117
mapped_clean_demux-RTE-48 17116 82.2726
mapped_clean_demux-RTE-49 11624 86.0018
mapped_clean_demux-RTE-50 22380 82.1586
mapped_clean_demux-RTE-51 16046 83.8086
mapped_clean_demux-RTE-52 17729 84.0635
mapped_clean_demux-RTE-53 20306 81.3868
mapped_clean_demux-RTE-54 16375 81.2655
mapped_clean_demux-RTE-55 24071 82.1704
mapped_clean_demux-RTE-56 8696 83.3589
mapped_clean_demux-RTE-57 29237 84.3829
mapped_clean_demux-RTE-59 15770 85.0593
mapped_clean_demux-RTE-61 12293 84.1986
mapped_clean_demux-RTE-62 19602 83.7836
mapped_clean_demux-RTE-63 9010 82.5394
mapped_clean_demux-RTE-64 10191 80.9066
mapped_clean_demux-RTE-65 23615 84.3634
mapped_clean_demux-RTE-66 49178 85.4853
mapped_clean_demux-RTE-67 17855 85.4715
mapped_clean_demux-RTE-68 21331 86.7961
mapped_clean_demux-RTE-69 22351 84.5156
mapped_clean_demux-RTE-70 13808 85.3294
mapped_clean_demux-RTE-71 17708 85.9862
mapped_clean_demux-RTE-72 16005 85.2872
mapped_clean_demux-RTE-73 11049 82.6279
mapped_clean_demux-RTE-74 14239 84.0158
mapped_clean_demux-RTE-75 23767 82.8407
mapped_clean_demux-RTE-76 32627 83.6075
mapped_clean_demux-RTE-77 25210 83.2343
mapped_clean_demux-RTE-78 17683 83.9409
mapped_clean_demux-RTE-79 14384 83.2022
mapped_clean_demux-unknown 76 71.6981
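To compare the three references, the per-sample tables above can be reduced to one mean mapping percentage each. A minimal sketch, assuming a table saved as whitespace-separated text with the columns shown here (sample, mapped, percent); `mean_pct` and the file name in the usage comment are hypothetical, and the `unknown` row is included unless filtered out beforehand.

```shell
# mean_pct: read a "sample mapped percent" table on stdin,
# print the mean of the percent column over the mapped_* rows (header line is skipped)
mean_pct() {
    awk '$1 ~ /^mapped_/ { sum += $3; n++ }
         END { if (n) printf "%.2f\n", sum / n }'
}

# Usage (hypothetical file name):
# mean_pct < Analysis/Miseq330/Mapping/LR991693_summary.txt
```

An unweighted mean gives every sample the same weight regardless of depth; weighting by the mapped column would be an alternative.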