2016-06-30 2 views
1

awkを使用してファイルから情報を抽出しようとしています。AWK - 別のファイルを使用して情報を抽出中 - 構文エラー

informationfile.txtは次のようになります

>ENST00000342992.10 cdna:known chromosome:GRCh38:2:178525989:178807421:-1 gene:ENSG00000155657.24 gene_biotype:protein_coding transcript_biotype:protein_coding gene_symbol:TTN description:titin [Source:HGNC Symbol;Acc:HGNC:12403] 
GCAGTCGTGCATTCCCAGCCTCGCCTCGGGTGTAGGGATTGCATAGAAAAGCAAAACTAC 
ACAGTCTTGACTGTGTAGTTTTGTTTTTAGGATTAGAGGCTCACCGATTCATGTCGGAGA 
TGGTCAGAAAAACCAACTCTCCATAGGACGTCGTTTCAGAAGCAACCTTGGGCTTAGTCC 
CACCCTTTTTAGGCACTCTTGAGAAATCAGAGTGCCTAGAAAGATGACAACTCAAGCACC 
GACGTTTACGCAGCCGTTACAAAGCGTTGTGGTACTGGAGGGTAGTACCGCAACCTTTGA 
GGCTCACATTAGTGGTTTTCCAGTTCCTGAGGTGAGCTGGTTTAGGGATGGCCAGGTGAT 
TTCCACTTCCACTCTGCCCGGCGTGCAGATCTCCTTTAGCGATGGCCGCGCTAAACTGAC 
GATCCCCGCCGTGACTAAAGCCAACAGTGGACGATATTCCCTGAAAGCCACCAATGGATC 
TGGACAAGCGACTAGTACTGCTGAGCTTCTCGTGAAAGCTGAGACAGCACCACCCAACTT 
CGTTCAACGACTGCAGAGCATGACCGTGAGACAAGGAAGCCAAGTGAGACTCCAAGTGAG 
AGTGACTGGAATCCCTACACCTGTGGTGAAGTTCTACCGGGATGGAGCCGAAATCCAGAG 
CTCCCTTGATTTCCAAATTTCACAAGAAGGCGACCTCTACAGCTTACTGATTGCAGAAGC 
ATACCCTGAGGACTCAGGGACCTATTCAGTAAATGCCACCAATAGCGTTGGAAGAGCTAC 
TTCGACTGCTGAATTACTGGTTCAAGGTGAAGAAGAAGTACCTGCTAAAAAGACAAAGAC 
AATTGTTTCGACTGCTCAGATCTCAGAATCAAGACAAACCCGAATTGAAAAGAAGATTGA 
AGCCCACTTTGATGCCAGATCAATTGCAACAGTTGAGATGGTCATAGATGGTGCCGCTGG 
GCAACAGCTGCCACATAAAACACCTCCCAGGATTCCTCCGAAGCCAAAGTCAAGATCCCC 
AACACCACCGTCTATTGCTGCCAAAGCACAGCTGGCTCGGCAGCAGTCCCCATCGCCCAT 
AAGACACTCCCCTTCCCCGGTCAGACACGTGCGGGCACCGACCCCATCTCCGGTCAGGTC 
CGTGTCTCCAGCAGCAAGAATCTCCACATCCCCCATCAGGTCTGTTAGGTCTCCATTGCT 
CATGCGTAAGACTCAGGCATCCACCGTGGCCACAGGTCCTGAAGTGCCTCCCCCTTGGAA 
GCAAGAGGGCTACGTGGCCTCCTCATCTGAGGCTGAGATGAGAGAGACAACGCTGACAAC 
CTCTACTCAGATCAGGACAGAAGAGAGATGGGAAGGGAGATACGGTGTCCAGGAGCAAGT 
GACCATCAGTGGTGCTGCGGGTGCTGCCGCCAGTGTGTCGGCCAGTGCTAGCTACGCAGC 
AGAGGCTGTTGCCACTGGTGCTAAAGAGGTGAAACAAGATGCTGACAAAAGTGCAGCTGT 
TGCGACTGTTGTTGCTGCCGTTGATATGGCCAGAGTGAGAGAACCAGTGATCAGCGCTGT 
AGAGCAGACTGCTCAGAGGACAACCACGACTGCTGTGCACATCCAACCTGCTCAAGAACA 
GGTAAGAAAGGAAGCGGAGAAGACTGCTGTAACTAAGGTAGTAGTGGCCGCCGATAAAGC 
CAAGGAACAAGAATTAAAATCAAGAACCAAAGAAGTAATTACCACAAAGCAAGAGCAGAT 
GCACGTAACTCATGAGCAGATAAGAAAAGAAACTGAAAAAACATTTGTACCAAAGGTAGT 
AATTTCCGCAGCTAAAGCCAAAGAACAAGAAACTAGAATTTCTGAAGAAATTACTAAGAA 
ACAGAAACAAGTAACTCAAGAAGCAATAAGACAGGAAACTGAGATAACTGCTGCATCCAT 
GGTGGTAGTTGCCACTGCAAAGTCCACAAAACTAGAAACAGTCCCGGGAGCTCAAGAAGA 
AACTACCACACAACAAGATCAAATGCACCTAAGTTATGAAAAGATAATGAAGGAAACTAG 
GAAAACAGTTGTACCTAAAGTCATAGTTGCCACACCCAAAGTCAAAGAACAAGATTTAGT 
>ENST00000460472.6 cdna:known chromosome:GRCh38:2:178525989:178807423:-1 gene:ENSG00000155657.24 gene_biotype:protein_coding transcript_biotype:protein_coding gene_symbol:TTN description:titin [Source:HGNC Symbol;Acc:HGNC:12403] 
GCAGTCGTGCATTCCCAGCCTCGCCTCGGGTGTAGGGATTGCATAGAAAAGCAAAACTAC 
ACAGTCTTGACTGTGTAGTTTTGTTTTTAGGATTAGAGGCTCACCGATTCATGTCGGAGA 
TGGTCAGAAAAACCAACTCTCCATAGGACGTCGTTTCAGAAGCAACCTTGGGCTTAGTCC 
CACCCTTTTTAGGCACTCTTGAGAAATCAGAGTGCCTAGAAAGATGACAACTCAAGCACC 
GACGTTTACGCAGCCGTTACAAAGCGTTGTGGTACTGGAGGGTAGTACCGCAACCTTTGA 
GGCTCACATTAGTGGTTTTCCAGTTCCTGAGGTGAGCTGGTTTAGGGATGGCCAGGTGAT 
TTCCACTTCCACTCTGCCCGGCGTGCAGATCTCCTTTAGCGATGGCCGCGCTAAACTGAC 
GATCCCCGCCGTGACTAAAGCCAACAGTGGACGATATTCCCTGAAAGCCACCAATGGATC 
TGGACAAGCGACTAGTACTGCTGAGCTTCTCGTGAAAGCTGAGACAGCACCACCCAACTT 
CGTTCAACGACTGCAGAGCATGACCGTGAGACAAGGAAGCCAAGTGAGACTCCAAGTGAG 
AGTGACTGGAATCCCTACACCTGTGGTGAAGTTCTACCGGGATGGAGCCGAAATCCAGAG 
CTCCCTTGATTTCCAAATTTCACAAGAAGGCGACCTCTACAGCTTACTGATTGCAGAAGC 
ATACCCTGAGGACTCAGGGACCTATTCAGTAAATGCCACCAATAGCGTTGGAAGAGCTAC 
TTCGACTGCTGAATTACTGGTTCAAGGTGAAGAAGAAGTACCTGCTAAAAAGACAAAGAC 
AATTGTTTCGACTGCTCAGATCTCAGAATCAAGACAAACCCGAATTGAAAAGAAGATTGA 
AGCCCACTTTGATGCCAGATCAATTGCAACAGTTGAGATGGTCATAGATGGTGCCGCTGG 
GCAACAGCTGCCACATAAAACACCTCCCAGGATTCCTCCGAAGCCAAAGTCAAGATCCCC 
AACACCACCGTCTATTGCTGCCAAAGCACAGCTGGCTCGGCAGCAGTCCCCATCGCCCAT 
AAGACACTCCCCTTCCCCGGTCAGACACGTGCGGGCACCGACCCCATCTCCGGTCAGGTC 
CGTGTCTCCAGCAGCAAGAATCTCCACATCCCCCATCAGGTCTGTTAGGTCTCCATTGCT 
CATGCGTAAGACTCAGGCATCCACCGTGGCCACAGGTCCTGAAGTGCCTCCCCCTTGGAA 
GCAAGAGGGCTACGTGGCCTCCTCATCTGAGGCTGAGATGAGAGAGACAACGCTGACAAC 
CTCTACTCAGATCAGGACAGAAGAGAGATGGGAAGGGAGATACGGTGTCCAGGAGCAAGT 
GACCATCAGTGGTGCTGCGGGTGCTGCCGCCAGTGTGTCGGCCAGTGCTAGCTACGCAGC 
AGAGGCTGTTGCCACTGGTGCTAAAGAGGTGAAACAAGATGCTGACAAAAGTGCAGCTGT 
TGCGACTGTTGTTGCTGCCGTTGATATGGCCAGAGTGAGAGAACCAGTGATCAGCGCTGT 
AGAGCAGACTGCTCAGAGGACAACCACGACTGCTGTGCACATCCAACCTGCTCAAGAACA 
GGTAAGAAAGGAAGCGGAGAAGACTGCTGTAACTAAGGTAGTAGTGGCCGCCGATAAAGC 
CAAGGAACAAGAATTAAAATCAAGAACCAAAGAAGTAATTACCACAAAGCAAGAGCAGAT 
GCACGTAACTCATGAGCAGATAAGAAAAGAAACTGAAAAAACATTTGTACCAAAGGTAGT 
AATTTCCGCAGCTAAAGCCAAAGAACAAGAAACTAGAATTTCTGAAGAAATTACTAAGAA 
ACAGAAACAAGTAACTCAAGAAGCAATAAGACAGGAAACTGAGATAACTGCTGCATCCAT 
GGTGGTAGTTGCCACTGCAAAGTCCACAAAACTAGAAACAGTCCCGGGAGCTCAAGAAGA 
AACTACCACACAACAAGATCAAATGCACCTAAGTTATGAAAAGATAATGAAGGAAACTAG 
GAAAACAGTTGTACCTAAAGTCATAGTTGCCACACCCAAAGTCAAAGAACAAGATTTAGT 
>ENST00000589042.5 cdna:known chromosome:GRCh38:2:178525989:178807423:-1 gene:ENSG00000155657.24 gene_biotype:protein_coding transcript_biotype:protein_coding gene_symbol:TTN description:titin [Source:HGNC Symbol;Acc:HGNC:12403] 
GCAGTCGTGCATTCCCAGCCTCGCCTCGGGTGTAGGGATTGCATAGAAAAGCAAAACTAC 
ACAGTCTTGACTGTGTAGTTTTGTTTTTAGGATTAGAGGCTCACCGATTCATGTCGGAGA 
TGGTCAGAAAAACCAACTCTCCATAGGACGTCGTTTCAGAAGCAACCTTGGGCTTAGTCC 
CACCCTTTTTAGGCACTCTTGAGAAATCAGAGTGCCTAGAAAGATGACAACTCAAGCACC 
GACGTTTACGCAGCCGTTACAAAGCGTTGTGGTACTGGAGGGTAGTACCGCAACCTTTGA 
GGCTCACATTAGTGGTTTTCCAGTTCCTGAGGTGAGCTGGTTTAGGGATGGCCAGGTGAT 
TTCCACTTCCACTCTGCCCGGCGTGCAGATCTCCTTTAGCGATGGCCGCGCTAAACTGAC 
GATCCCCGCCGTGACTAAAGCCAACAGTGGACGATATTCCCTGAAAGCCACCAATGGATC 
TGGACAAGCGACTAGTACTGCTGAGCTTCTCGTGAAAGCTGAGACAGCACCACCCAACTT 
CGTTCAACGACTGCAGAGCATGACCGTGAGACAAGGAAGCCAAGTGAGACTCCAAGTGAG 
AGTGACTGGAATCCCTACACCTGTGGTGAAGTTCTACCGGGATGGAGCCGAAATCCAGAG 
CTCCCTTGATTTCCAAATTTCACAAGAAGGCGACCTCTACAGCTTACTGATTGCAGAAGC 
ATACCCTGAGGACTCAGGGACCTATTCAGTAAATGCCACCAATAGCGTTGGAAGAGCTAC 
TTCGACTGCTGAATTACTGGTTCAAGGTGAAGAAGAAGTACCTGCTAAAAAGACAAAGAC 
AATTGTTTCGACTGCTCAGATCTCAGAATCAAGACAAACCCGAATTGAAAAGAAGATTGA 
AGCCCACTTTGATGCCAGATCAATTGCAACAGTTGAGATGGTCATAGATGGTGCCGCTGG 
GCAACAGCTGCCACATAAAACACCTCCCAGGATTCCTCCGAAGCCAAAGTCAAGATCCCC 
AACACCACCGTCTATTGCTGCCAAAGCACAGCTGGCTCGGCAGCAGTCCCCATCGCCCAT 
AAGACACTCCCCTTCCCCGGTCAGACACGTGCGGGCACCGACCCCATCTCCGGTCAGGTC 
CGTGTCTCCAGCAGCAAGAATCTCCACATCCCCCATCAGGTCTGTTAGGTCTCCATTGCT 
CATGCGTAAGACTCAGGCATCCACCGTGGCCACAGGTCCTGAAGTGCCTCCCCCTTGGAA 
GCAAGAGGGCTACGTGGCCTCCTCATCTGAGGCTGAGATGAGAGAGACAACGCTGACAAC 
CTCTACTCAGATCAGGACAGAAGAGAGATGGGAAGGGAGATACGGTGTCCAGGAGCAAGT 
GACCATCAGTGGTGCTGCGGGTGCTGCCGCCAGTGTGTCGGCCAGTGCTAGCTACGCAGC 
AGAGGCTGTTGCCACTGGTGCTAAAGAGGTGAAACAAGATGCTGACAAAAGTGCAGCTGT 
TGCGACTGTTGTTGCTGCCGTTGATATGGCCAGAGTGAGAGAACCAGTGATCAGCGCTGT 
AGAGCAGACTGCTCAGAGGACAACCACGACTGCTGTGCACATCCAACCTGCTCAAGAACA 
GGTAAGAAAGGAAGCGGAGAAGACTGCTGTAACTAAGGTAGTAGTGGCCGCCGATAAAGC 
CAAGGAACAAGAATTAAAATCAAGAACCAAAGAAGTAATTACCACAAAGCAAGAGCAGAT 
GCACGTAACTCATGAGCAGATAAGAAAAGAAACTGAAAAAACATTTGTACCAAAGGTAGT 
AATTTCCGCAGCTAAAGCCAAAGAACAAGAAACTAGAATTTCTGAAGAAATTACTAAGAA 
ACAGAAACAAGTAACTCAAGAAGCAATAAGACAGGAAACTGAGATAACTGCTGCATCCAT 
GGTGGTAGTTGCCACTGCAAAGTCCACAAAACTAGAAACAGTCCCGGGAGCTCAAGAAGA 
AACTACCACACAACAAGATCAAATGCACCTAAGTTATGAAAAGATAATGAAGGAAACTAG 
GAAAACAGTTGTACCTAAAGTCATAGTTGCCACACCCAAAGTCAAAGAACAAGATTTAGT 
>ENST00000591111.5 cdna:known chromosome:GRCh38:2:178525989:178807423:-1 gene:ENSG00000155657.24 gene_biotype:protein_coding transcript_biotype:protein_coding gene_symbol:TTN description:titin [Source:HGNC Symbol;Acc:HGNC:12403] 
GCAGTCGTGCATTCCCAGCCTCGCCTCGGGTGTAGGGATTGCATAGAAAAGCAAAACTAC 
ACAGTCTTGACTGTGTAGTTTTGTTTTTAGGATTAGAGGCTCACCGATTCATGTCGGAGA 
TGGTCAGAAAAACCAACTCTCCATAGGACGTCGTTTCAGAAGCAACCTTGGGCTTAGTCC 
CACCCTTTTTAGGCACTCTTGAGAAATCAGAGTGCCTAGAAAGATGACAACTCAAGCACC 
GACGTTTACGCAGCCGTTACAAAGCGTTGTGGTACTGGAGGGTAGTACCGCAACCTTTGA 
GGCTCACATTAGTGGTTTTCCAGTTCCTGAGGTGAGCTGGTTTAGGGATGGCCAGGTGAT 
TTCCACTTCCACTCTGCCCGGCGTGCAGATCTCCTTTAGCGATGGCCGCGCTAAACTGAC 
GATCCCCGCCGTGACTAAAGCCAACAGTGGACGATATTCCCTGAAAGCCACCAATGGATC 
TGGACAAGCGACTAGTACTGCTGAGCTTCTCGTGAAAGCTGAGACAGCACCACCCAACTT 
CGTTCAACGACTGCAGAGCATGACCGTGAGACAAGGAAGCCAAGTGAGACTCCAAGTGAG 
AGTGACTGGAATCCCTACACCTGTGGTGAAGTTCTACCGGGATGGAGCCGAAATCCAGAG 
CTCCCTTGATTTCCAAATTTCACAAGAAGGCGACCTCTACAGCTTACTGATTGCAGAAGC 
ATACCCTGAGGACTCAGGGACCTATTCAGTAAATGCCACCAATAGCGTTGGAAGAGCTAC 
TTCGACTGCTGAATTACTGGTTCAAGGTGAAGAAGAAGTACCTGCTAAAAAGACAAAGAC 
AATTGTTTCGACTGCTCAGATCTCAGAATCAAGACAAACCCGAATTGAAAAGAAGATTGA 
AGCCCACTTTGATGCCAGATCAATTGCAACAGTTGAGATGGTCATAGATGGTGCCGCTGG 
GCAACAGCTGCCACATAAAACACCTCCCAGGATTCCTCCGAAGCCAAAGTCAAGATCCCC 
AACACCACCGTCTATTGCTGCCAAAGCACAGCTGGCTCGGCAGCAGTCCCCATCGCCCAT 
AAGACACTCCCCTTCCCCGGTCAGACACGTGCGGGCACCGACCCCATCTCCGGTCAGGTC 
CGTGTCTCCAGCAGCAAGAATCTCCACATCCCCCATCAGGTCTGTTAGGTCTCCATTGCT 
CATGCGTAAGACTCAGGCATCCACCGTGGCCACAGGTCCTGAAGTGCCTCCCCCTTGGAA 
GCAAGAGGGCTACGTGGCCTCCTCATCTGAGGCTGAGATGAGAGAGACAACGCTGACAAC 
CTCTACTCAGATCAGGACAGAAGAGAGATGGGAAGGGAGATACGGTGTCCAGGAGCAAGT 
GACCATCAGTGGTGCTGCGGGTGCTGCCGCCAGTGTGTCGGCCAGTGCTAGCTACGCAGC 
AGAGGCTGTTGCCACTGGTGCTAAAGAGGTGAAACAAGATGCTGACAAAAGTGCAGCTGT 
TGCGACTGTTGTTGCTGCCGTTGATATGGCCAGAGTGAGAGAACCAGTGATCAGCGCTGT 
AGAGCAGACTGCTCAGAGGACAACCACGACTGCTGTGCACATCCAACCTGCTCAAGAACA 
GGTAAGAAAGGAAGCGGAGAAGACTGCTGTAACTAAGGTAGTAGTGGCCGCCGATAAAGC 
CAAGGAACAAGAATTAAAATCAAGAACCAAAGAAGTAATTACCACAAAGCAAGAGCAGAT 
GCACGTAACTCATGAGCAGATAAGAAAAGAAACTGAAAAAACATTTGTACCAAAGGTAGT 
AATTTCCGCAGCTAAAGCCAAAGAACAAGAAACTAGAATTTCTGAAGAAATTACTAAGAA 
ACAGAAACAAGTAACTCAAGAAGCAATAAGACAGGAAACTGAGATAACTGCTGCATCCAT 
GGTGGTAGTTGCCACTGCAAAGTCCACAAAACTAGAAACAGTCCCGGGAGCTCAAGAAGA 
AACTACCACACAACAAGATCAAATGCACCTAAGTTATGAAAAGATAATGAAGGAAACTAG 
GAAAACAGTTGTACCTAAAGTCATAGTTGCCACACCCAAAGTCAAAGAACAAGATTTAGT 
>ENST00000425332.2 cdna:known chromosome:GRCh38:2:178663627:178667307:-1 gene:ENSG00000155657.24 gene_biotype:protein_coding transcript_biotype:protein_coding gene_symbol:TTN description:titin [Source:HGNC Symbol;Acc:HGNC:12403] 
GCAGTCGTGCATTCCCAGCCTCGCCTCGGGTGTAGGGATTGCATAGAAAAGCAAAACTAC 
ACAGTCTTGACTGTGTAGTTTTGTTTTTAGGATTAGAGGCTCACCGATTCATGTCGGAGA 
TGGTCAGAAAAACCAACTCTCCATAGGACGTCGTTTCAGAAGCAACCTTGGGCTTAGTCC 
CACCCTTTTTAGGCACTCTTGAGAAATCAGAGTGCCTAGAAAGATGACAACTCAAGCACC 
GACGTTTACGCAGCCGTTACAAAGCGTTGTGGTACTGGAGGGTAGTACCGCAACCTTTGA 
GGCTCACATTAGTGGTTTTCCAGTTCCTGAGGTGAGCTGGTTTAGGGATGGCCAGGTGAT 
TTCCACTTCCACTCTGCCCGGCGTGCAGATCTCCTTTAGCGATGGCCGCGCTAAACTGAC 
GATCCCCGCCGTGACTAAAGCCAACAGTGGACGATATTCCCTGAAAGCCACCAATGGATC 
TGGACAAGCGACTAGTACTGCTGAGCTTCTCGTGAAAGCTGAGACAGCACCACCCAACTT 
CGTTCAACGACTGCAGAGCATGACCGTGAGACAAGGAAGCCAAGTGAGACTCCAAGTGAG 
AGTGACTGGAATCCCTACACCTGTGGTGAAGTTCTACCGGGATGGAGCCGAAATCCAGAG 
CTCCCTTGATTTCCAAATTTCACAAGAAGGCGACCTCTACAGCTTACTGATTGCAGAAGC 
ATACCCTGAGGACTCAGGGACCTATTCAGTAAATGCCACCAATAGCGTTGGAAGAGCTAC 
TTCGACTGCTGAATTACTGGTTCAAGGTGAAGAAGAAGTACCTGCTAAAAAGACAAAGAC 
AATTGTTTCGACTGCTCAGATCTCAGAATCAAGACAAACCCGAATTGAAAAGAAGATTGA 
AGCCCACTTTGATGCCAGATCAATTGCAACAGTTGAGATGGTCATAGATGGTGCCGCTGG 
GCAACAGCTGCCACATAAAACACCTCCCAGGATTCCTCCGAAGCCAAAGTCAAGATCCCC 
AACACCACCGTCTATTGCTGCCAAAGCACAGCTGGCTCGGCAGCAGTCCCCATCGCCCAT 
AAGACACTCCCCTTCCCCGGTCAGACACGTGCGGGCACCGACCCCATCTCCGGTCAGGTC 
CGTGTCTCCAGCAGCAAGAATCTCCACATCCCCCATCAGGTCTGTTAGGTCTCCATTGCT 
CATGCGTAAGACTCAGGCATCCACCGTGGCCACAGGTCCTGAAGTGCCTCCCCCTTGGAA 
GCAAGAGGGCTACGTGGCCTCCTCATCTGAGGCTGAGATGAGAGAGACAACGCTGACAAC 
CTCTACTCAGATCAGGACAGAAGAGAGATGGGAAGGGAGATACGGTGTCCAGGAGCAAGT 
GACCATCAGTGGTGCTGCGGGTGCTGCCGCCAGTGTGTCGGCCAGTGCTAGCTACGCAGC 
AGAGGCTGTTGCCACTGGTGCTAAAGAGGTGAAACAAGATGCTGACAAAAGTGCAGCTGT 
TGCGACTGTTGTTGCTGCCGTTGATATGGCCAGAGTGAGAGAACCAGTGATCAGCGCTGT 
AGAGCAGACTGCTCAGAGGACAACCACGACTGCTGTGCACATCCAACCTGCTCAAGAACA 
GGTAAGAAAGGAAGCGGAGAAGACTGCTGTAACTAAGGTAGTAGTGGCCGCCGATAAAGC 
CAAGGAACAAGAATTAAAATCAAGAACCAAAGAAGTAATTACCACAAAGCAAGAGCAGAT 
GCACGTAACTCATGAGCAGATAAGAAAAGAAACTGAAAAAACATTTGTACCAAAGGTAGT 
AATTTCCGCAGCTAAAGCCAAAGAACAAGAAACTAGAATTTCTGAAGAAATTACTAAGAA 
ACAGAAACAAGTAACTCAAGAAGCAATAAGACAGGAAACTGAGATAACTGCTGCATCCAT 
GGTGGTAGTTGCCACTGCAAAGTCCACAAAACTAGAAACAGTCCCGGGAGCTCAAGAAGA 
AACTACCACACAACAAGATCAAATGCACCTAAGTTATGAAAAGATAATGAAGGAAACTAG 
GAAAACAGTTGTACCTAAAGTCATAGTTGCCACACCCAAAGTCAAAGAACAAGATTTAGT 
>ENST00000448510.2 cdna:known chromosome:GRCh38:2:178669625:178672418:-1 gene:ENSG00000155657.24 gene_biotype:protein_coding transcript_biotype:protein_coding gene_symbol:TTN description:titin [Source:HGNC Symbol;Acc:HGNC:12403] 
GCAGTCGTGCATTCCCAGCCTCGCCTCGGGTGTAGGGATTGCATAGAAAAGCAAAACTAC 
ACAGTCTTGACTGTGTAGTTTTGTTTTTAGGATTAGAGGCTCACCGATTCATGTCGGAGA 
TGGTCAGAAAAACCAACTCTCCATAGGACGTCGTTTCAGAAGCAACCTTGGGCTTAGTCC 
CACCCTTTTTAGGCACTCTTGAGAAATCAGAGTGCCTAGAAAGATGACAACTCAAGCACC 
GACGTTTACGCAGCCGTTACAAAGCGTTGTGGTACTGGAGGGTAGTACCGCAACCTTTGA 
GGCTCACATTAGTGGTTTTCCAGTTCCTGAGGTGAGCTGGTTTAGGGATGGCCAGGTGAT 
TTCCACTTCCACTCTGCCCGGCGTGCAGATCTCCTTTAGCGATGGCCGCGCTAAACTGAC 
GATCCCCGCCGTGACTAAAGCCAACAGTGGACGATATTCCCTGAAAGCCACCAATGGATC 
TGGACAAGCGACTAGTACTGCTGAGCTTCTCGTGAAAGCTGAGACAGCACCACCCAACTT 
CGTTCAACGACTGCAGAGCATGACCGTGAGACAAGGAAGCCAAGTGAGACTCCAAGTGAG 
AGTGACTGGAATCCCTACACCTGTGGTGAAGTTCTACCGGGATGGAGCCGAAATCCAGAG 
CTCCCTTGATTTCCAAATTTCACAAGAAGGCGACCTCTACAGCTTACTGATTGCAGAAGC 
ATACCCTGAGGACTCAGGGACCTATTCAGTAAATGCCACCAATAGCGTTGGAAGAGCTAC 
TTCGACTGCTGAATTACTGGTTCAAGGTGAAGAAGAAGTACCTGCTAAAAAGACAAAGAC 
AATTGTTTCGACTGCTCAGATCTCAGAATCAAGACAAACCCGAATTGAAAAGAAGATTGA 
AGCCCACTTTGATGCCAGATCAATTGCAACAGTTGAGATGGTCATAGATGGTGCCGCTGG 
GCAACAGCTGCCACATAAAACACCTCCCAGGATTCCTCCGAAGCCAAAGTCAAGATCCCC 
AACACCACCGTCTATTGCTGCCAAAGCACAGCTGGCTCGGCAGCAGTCCCCATCGCCCAT 
AAGACACTCCCCTTCCCCGGTCAGACACGTGCGGGCACCGACCCCATCTCCGGTCAGGTC 
CGTGTCTCCAGCAGCAAGAATCTCCACATCCCCCATCAGGTCTGTTAGGTCTCCATTGCT 
CATGCGTAAGACTCAGGCATCCACCGTGGCCACAGGTCCTGAAGTGCCTCCCCCTTGGAA 
GCAAGAGGGCTACGTGGCCTCCTCATCTGAGGCTGAGATGAGAGAGACAACGCTGACAAC 
CTCTACTCAGATCAGGACAGAAGAGAGATGGGAAGGGAGATACGGTGTCCAGGAGCAAGT 
GACCATCAGTGGTGCTGCGGGTGCTGCCGCCAGTGTGTCGGCCAGTGCTAGCTACGCAGC 
AGAGGCTGTTGCCACTGGTGCTAAAGAGGTGAAACAAGATGCTGACAAAAGTGCAGCTGT 
TGCGACTGTTGTTGCTGCCGTTGATATGGCCAGAGTGAGAGAACCAGTGATCAGCGCTGT 
AGAGCAGACTGCTCAGAGGACAACCACGACTGCTGTGCACATCCAACCTGCTCAAGAACA 
GGTAAGAAAGGAAGCGGAGAAGACTGCTGTAACTAAGGTAGTAGTGGCCGCCGATAAAGC 
CAAGGAACAAGAATTAAAATCAAGAACCAAAGAAGTAATTACCACAAAGCAAGAGCAGAT 
GCACGTAACTCATGAGCAGATAAGAAAAGAAACTGAAAAAACATTTGTACCAAAGGTAGT 
AATTTCCGCAGCTAAAGCCAAAGAACAAGAAACTAGAATTTCTGAAGAAATTACTAAGAA 
ACAGAAACAAGTAACTCAAGAAGCAATAAGACAGGAAACTGAGATAACTGCTGCATCCAT 
GGTGGTAGTTGCCACTGCAAAGTCCACAAAACTAGAAACAGTCCCGGGAGCTCAAGAAGA 
AACTACCACACAACAAGATCAAATGCACCTAAGTTATGAAAAGATAATGAAGGAAACTAG 
GAAAACAGTTGTACCTAAAGTCATAGTTGCCACACCCAAAGTCAAAGAACAAGATTTAGT 
>ENST00000360870.9 cdna:known chromosome:GRCh38:2:178744405:178807421:-1 gene:ENSG00000155657.24 gene_biotype:protein_coding transcript_biotype:protein_coding gene_symbol:TTN description:titin [Source:HGNC Symbol;Acc:HGNC:12403] 
GCAGTCGTGCATTCCCAGCCTCGCCTCGGGTGTAGGGATTGCATAGAAAAGCAAAACTAC 
ACAGTCTTGACTGTGTAGTTTTGTTTTTAGGATTAGAGGCTCACCGATTCATGTCGGAGA 
TGGTCAGAAAAACCAACTCTCCATAGGACGTCGTTTCAGAAGCAACCTTGGGCTTAGTCC 
CACCCTTTTTAGGCACTCTTGAGAAATCAGAGTGCCTAGAAAGATGACAACTCAAGCACC 
GACGTTTACGCAGCCGTTACAAAGCGTTGTGGTACTGGAGGGTAGTACCGCAACCTTTGA 
GGCTCACATTAGTGGTTTTCCAGTTCCTGAGGTGAGCTGGTTTAGGGATGGCCAGGTGAT 
TTCCACTTCCACTCTGCCCGGCGTGCAGATCTCCTTTAGCGATGGCCGCGCTAAACTGAC 
GATCCCCGCCGTGACTAAAGCCAACAGTGGACGATATTCCCTGAAAGCCACCAATGGATC 
TGGACAAGCGACTAGTACTGCTGAGCTTCTCGTGAAAGCTGAGACAGCACCACCCAACTT 
CGTTCAACGACTGCAGAGCATGACCGTGAGACAAGGAAGCCAAGTGAGACTCCAAGTGAG 
AGTGACTGGAATCCCTACACCTGTGGTGAAGTTCTACCGGGATGGAGCCGAAATCCAGAG 
CTCCCTTGATTTCCAAATTTCACAAGAAGGCGACCTCTACAGCTTACTGATTGCAGAAGC 
ATACCCTGAGGACTCAGGGACCTATTCAGTAAATGCCACCAATAGCGTTGGAAGAGCTAC 
TTCGACTGCTGAATTACTGGTTCAAGGTGAAGAAGAAGTACCTGCTAAAAAGACAAAGAC 
AATTGTTTCGACTGCTCAGATCTCAGAATCAAGACAAACCCGAATTGAAAAGAAGATTGA 
AGCCCACTTTGATGCCAGATCAATTGCAACAGTTGAGATGGTCATAGATGGTGCCGCTGG 
GCAACAGCTGCCACATAAAACACCTCCCAGGATTCCTCCGAAGCCAAAGTCAAGATCCCC 
AACACCACCGTCTATTGCTGCCAAAGCACAGCTGGCTCGGCAGCAGTCCCCATCGCCCAT 
AAGACACTCCCCTTCCCCGGTCAGACACGTGCGGGCACCGACCCCATCTCCGGTCAGGTC 
CGTGTCTCCAGCAGCAAGAATCTCCACATCCCCCATCAGGTCTGTTAGGTCTCCATTGCT 
CATGCGTAAGACTCAGGCATCCACCGTGGCCACAGGTCCTGAAGTGCCTCCCCCTTGGAA 
GCAAGAGGGCTACGTGGCCTCCTCATCTGAGGCTGAGATGAGAGAGACAACGCTGACAAC 
CTCTACTCAGATCAGGACAGAAGAGAGATGGGAAGGGAGATACGGTGTCCAGGAGCAAGT 
GACCATCAGTGGTGCTGCGGGTGCTGCCGCCAGTGTGTCGGCCAGTGCTAGCTACGCAGC 
AGAGGCTGTTGCCACTGGTGCTAAAGAGGTGAAACAAGATGCTGACAAAAGTGCAGCTGT 
TGCGACTGTTGTTGCTGCCGTTGATATGGCCAGAGTGAGAGAACCAGTGATCAGCGCTGT 
AGAGCAGACTGCTCAGAGGACAACCACGACTGCTGTGCACATCCAACCTGCTCAAGAACA 
GGTAAGAAAGGAAGCGGAGAAGACTGCTGTAACTAAGGTAGTAGTGGCCGCCGATAAAGC 
CAAGGAACAAGAATTAAAATCAAGAACCAAAGAAGTAATTACCACAAAGCAAGAGCAGAT 
GCACGTAACTCATGAGCAGATAAGAAAAGAAACTGAAAAAACATTTGTACCAAAGGTAGT 
AATTTCCGCAGCTAAAGCCAAAGAACAAGAAACTAGAATTTCTGAAGAAATTACTAAGAA 
ACAGAAACAAGTAACTCAAGAAGCAATAAGACAGGAAACTGAGATAACTGCTGCATCCAT 
GGTGGTAGTTGCCACTGCAAAGTCCACAAAACTAGAAACAGTCCCGGGAGCTCAAGAAGA 
AACTACCACACAACAAGATCAAATGCACCTAAGTTATGAAAAGATAATGAAGGAAACTAG 
GAAAACAGTTGTACCTAAAGTCATAGTTGCCACACCCAAAGTCAAAGAACAAGATTTAGT 
>ENST00000634225.1 cdna:known chromosome:GRCh38:2:178753361:178767825:-1 gene:ENSG00000155657.24 gene_biotype:protein_coding transcript_biotype:protein_coding gene_symbol:TTN description:titin [Source:HGNC Symbol;Acc:HGNC:12403] 
GCAGTCGTGCATTCCCAGCCTCGCCTCGGGTGTAGGGATTGCATAGAAAAGCAAAACTAC 
ACAGTCTTGACTGTGTAGTTTTGTTTTTAGGATTAGAGGCTCACCGATTCATGTCGGAGA 
TGGTCAGAAAAACCAACTCTCCATAGGACGTCGTTTCAGAAGCAACCTTGGGCTTAGTCC 
CACCCTTTTTAGGCACTCTTGAGAAATCAGAGTGCCTAGAAAGATGACAACTCAAGCACC 
GACGTTTACGCAGCCGTTACAAAGCGTTGTGGTACTGGAGGGTAGTACCGCAACCTTTGA 
GGCTCACATTAGTGGTTTTCCAGTTCCTGAGGTGAGCTGGTTTAGGGATGGCCAGGTGAT 
TTCCACTTCCACTCTGCCCGGCGTGCAGATCTCCTTTAGCGATGGCCGCGCTAAACTGAC 
GATCCCCGCCGTGACTAAAGCCAACAGTGGACGATATTCCCTGAAAGCCACCAATGGATC 
TGGACAAGCGACTAGTACTGCTGAGCTTCTCGTGAAAGCTGAGACAGCACCACCCAACTT 
CGTTCAACGACTGCAGAGCATGACCGTGAGACAAGGAAGCCAAGTGAGACTCCAAGTGAG 
AGTGACTGGAATCCCTACACCTGTGGTGAAGTTCTACCGGGATGGAGCCGAAATCCAGAG 
CTCCCTTGATTTCCAAATTTCACAAGAAGGCGACCTCTACAGCTTACTGATTGCAGAAGC 
ATACCCTGAGGACTCAGGGACCTATTCAGTAAATGCCACCAATAGCGTTGGAAGAGCTAC 
TTCGACTGCTGAATTACTGGTTCAAGGTGAAGAAGAAGTACCTGCTAAAAAGACAAAGAC 
AATTGTTTCGACTGCTCAGATCTCAGAATCAAGACAAACCCGAATTGAAAAGAAGATTGA 
AGCCCACTTTGATGCCAGATCAATTGCAACAGTTGAGATGGTCATAGATGGTGCCGCTGG 
GCAACAGCTGCCACATAAAACACCTCCCAGGATTCCTCCGAAGCCAAAGTCAAGATCCCC 
AACACCACCGTCTATTGCTGCCAAAGCACAGCTGGCTCGGCAGCAGTCCCCATCGCCCAT 
AAGACACTCCCCTTCCCCGGTCAGACACGTGCGGGCACCGACCCCATCTCCGGTCAGGTC 
CGTGTCTCCAGCAGCAAGAATCTCCACATCCCCCATCAGGTCTGTTAGGTCTCCATTGCT 
CATGCGTAAGACTCAGGCATCCACCGTGGCCACAGGTCCTGAAGTGCCTCCCCCTTGGAA 
GCAAGAGGGCTACGTGGCCTCCTCATCTGAGGCTGAGATGAGAGAGACAACGCTGACAAC 
CTCTACTCAGATCAGGACAGAAGAGAGATGGGAAGGGAGATACGGTGTCCAGGAGCAAGT 
GACCATCAGTGGTGCTGCGGGTGCTGCCGCCAGTGTGTCGGCCAGTGCTAGCTACGCAGC 
AGAGGCTGTTGCCACTGGTGCTAAAGAGGTGAAACAAGATGCTGACAAAAGTGCAGCTGT 
TGCGACTGTTGTTGCTGCCGTTGATATGGCCAGAGTGAGAGAACCAGTGATCAGCGCTGT 
AGAGCAGACTGCTCAGAGGACAACCACGACTGCTGTGCACATCCAACCTGCTCAAGAACA 
GGTAAGAAAGGAAGCGGAGAAGACTGCTGTAACTAAGGTAGTAGTGGCCGCCGATAAAGC 
CAAGGAACAAGAATTAAAATCAAGAACCAAAGAAGTAATTACCACAAAGCAAGAGCAGAT 
GCACGTAACTCATGAGCAGATAAGAAAAGAAACTGAAAAAACATTTGTACCAAAGGTAGT 
AATTTCCGCAGCTAAAGCCAAAGAACAAGAAACTAGAATTTCTGAAGAAATTACTAAGAA 
ACAGAAACAAGTAACTCAAGAAGCAATAAGACAGGAAACTGAGATAACTGCTGCATCCAT 
GGTGGTAGTTGCCACTGCAAAGTCCACAAAACTAGAAACAGTCCCGGGAGCTCAAGAAGA 
AACTACCACACAACAAGATCAAATGCACCTAAGTTATGAAAAGATAATGAAGGAAACTAG 
GAAAACAGTTGTACCTAAAGTCATAGTTGCCACACCCAAAGTCAAAGAACAAGATTTAGT 
>ENST00000436599.1 cdna:known chromosome:GRCh38:2:178786089:178794954:-1 gene:ENSG00000155657.24 gene_biotype:protein_coding transcript_biotype:protein_coding gene_symbol:TTN description:titin [Source:HGNC Symbol;Acc:HGNC:12403] 
>ENST00000470257.1 cdna:known chromosome:GRCh38:2:178798495:178807408:-1 gene:ENSG00000155657.24 gene_biotype:protein_coding transcript_biotype:retained_intron gene_symbol:TTN description:titin [Source:HGNC Symbol;Acc:HGNC:12403] 
>ENST00000412264.1 cdna:known chromosome:GRCh38:2:178802287:178830802:-1 gene:ENSG00000155657.24 gene_biotype:protein_coding transcript_biotype:protein_coding gene_symbol:TTN description:titin [Source:HGNC Symbol;Acc:HGNC:12403] 
GCAGTCGTGCATTCCCAGCCTCGCCTCGGGTGTAGGGATTGCATAGAAAAGCAAAACTAC 
ACAGTCTTGACTGTGTAGTTTTGTTTTTAGGATTAGAGGCTCACCGATTCATGTCGGAGA 
TGGTCAGAAAAACCAACTCTCCATAGGACGTCGTTTCAGAAGCAACCTTGGGCTTAGTCC 
CACCCTTTTTAGGCACTCTTGAGAAATCAGAGTGCCTAGAAAGATGACAACTCAAGCACC 
GACGTTTACGCAGCCGTTACAAAGCGTTGTGGTACTGGAGGGTAGTACCGCAACCTTTGA 
GGCTCACATTAGTGGTTTTCCAGTTCCTGAGGTGAGCTGGTTTAGGGATGGCCAGGTGAT 
TTCCACTTCCACTCTGCCCGGCGTGCAGATCTCCTTTAGCGATGGCCGCGCTAAACTGAC 
GATCCCCGCCGTGACTAAAGCCAACAGTGGACGATATTCCCTGAAAGCCACCAATGGATC 
TGGACAAGCGACTAGTACTGCTGAGCTTCTCGTGAAAGCTGAGACAGCACCACCCAACTT 
CGTTCAACGACTGCAGAGCATGACCGTGAGACAAGGAAGCCAAGTGAGACTCCAAGTGAG 
AGTGACTGGAATCCCTACACCTGTGGTGAAGTTCTACCGGGATGGAGCCGAAATCCAGAG 
CTCCCTTGATTTCCAAATTTCACAAGAAGGCGACCTCTACAGCTTACTGATTGCAGAAGC 
ATACCCTGAGGACTCAGGGACCTATTCAGTAAATGCCACCAATAGCGTTGGAAGAGCTAC 
TTCGACTGCTGAATTACTGGTTCAAGGTGAAGAAGAAGTACCTGCTAAAAAGACAAAGAC 
AATTGTTTCGACTGCTCAGATCTCAGAATCAAGACAAACCCGAATTGAAAAGAAGATTGA 
AGCCCACTTTGATGCCAGATCAATTGCAACAGTTGAGATGGTCATAGATGGTGCCGCTGG 
GCAACAGCTGCCACATAAAACACCTCCCAGGATTCCTCCGAAGCCAAAGTCAAGATCCCC 
AACACCACCGTCTATTGCTGCCAAAGCACAGCTGGCTCGGCAGCAGTCCCCATCGCCCAT 
AAGACACTCCCCTTCCCCGGTCAGACACGTGCGGGCACCGACCCCATCTCCGGTCAGGTC 
CGTGTCTCCAGCAGCAAGAATCTCCACATCCCCCATCAGGTCTGTTAGGTCTCCATTGCT 
CATGCGTAAGACTCAGGCATCCACCGTGGCCACAGGTCCTGAAGTGCCTCCCCCTTGGAA 
GCAAGAGGGCTACGTGGCCTCCTCATCTGAGGCTGAGATGAGAGAGACAACGCTGACAAC 
CTCTACTCAGATCAGGACAGAAGAGAGATGGGAAGGGAGATACGGTGTCCAGGAGCAAGT 
GACCATCAGTGGTGCTGCGGGTGCTGCCGCCAGTGTGTCGGCCAGTGCTAGCTACGCAGC 
AGAGGCTGTTGCCACTGGTGCTAAAGAGGTGAAACAAGATGCTGACAAAAGTGCAGCTGT 
TGCGACTGTTGTTGCTGCCGTTGATATGGCCAGAGTGAGAGAACCAGTGATCAGCGCTGT 
AGAGCAGACTGCTCAGAGGACAACCACGACTGCTGTGCACATCCAACCTGCTCAAGAACA 
GGTAAGAAAGGAAGCGGAGAAGACTGCTGTAACTAAGGTAGTAGTGGCCGCCGATAAAGC 
CAAGGAACAAGAATTAAAATCAAGAACCAAAGAAGTAATTACCACAAAGCAAGAGCAGAT 
GCACGTAACTCATGAGCAGATAAGAAAAGAAACTGAAAAAACATTTGTACCAAAGGTAGT 
AATTTCCGCAGCTAAAGCCAAAGAACAAGAAACTAGAATTTCTGAAGAAATTACTAAGAA 
ACAGAAACAAGTAACTCAAGAAGCAATAAGACAGGAAACTGAGATAACTGCTGCATCCAT 
GGTGGTAGTTGCCACTGCAAAGTCCACAAAACTAGAAACAGTCCCGGGAGCTCAAGAAGA 
AACTACCACACAACAAGATCAAATGCACCTAAGTTATGAAAAGATAATGAAGGAAACTAG 
GAAAACAGTTGTACCTAAAGTCATAGTTGCCACACCCAAAGTCAAAGAACAAGATTTAGT 
>ENST00000359218.9 cdna:known chromosome:GRCh38:2:178525989:178807423:-1 gene:ENSG00000155657.24 gene_biotype:protein_coding transcript_biotype:protein_coding gene_symbol:TTN description:titin [Source:HGNC Symbol;Acc:HGNC:12403] 
GCAGTCGTGCATTCCCAGCCTCGCCTCGGGTGTAGGGATTGCATAGAAAAGCAAAACTAC 
ACAGTCTTGACTGTGTAGTTTTGTTTTTAGGATTAGAGGCTCACCGATTCATGTCGGAGA 
TGGTCAGAAAAACCAACTCTCCATAGGACGTCGTTTCAGAAGCAACCTTGGGCTTAGTCC 
CACCCTTTTTAGGCACTCTTGAGAAATCAGAGTGCCTAGAAAGATGACAACTCAAGCACC 
GACGTTTACGCAGCCGTTACAAAGCGTTGTGGTACTGGAGGGTAGTACCGCAACCTTTGA 
GGCTCACATTAGTGGTTTTCCAGTTCCTGAGGTGAGCTGGTTTAGGGATGGCCAGGTGAT 
TTCCACTTCCACTCTGCCCGGCGTGCAGATCTCCTTTAGCGATGGCCGCGCTAAACTGAC 
GATCCCCGCCGTGACTAAAGCCAACAGTGGACGATATTCCCTGAAAGCCACCAATGGATC 
TGGACAAGCGACTAGTACTGCTGAGCTTCTCGTGAAAGCTGAGACAGCACCACCCAACTT 
CGTTCAACGACTGCAGAGCATGACCGTGAGACAAGGAAGCCAAGTGAGACTCCAAGTGAG 
AGTGACTGGAATCCCTACACCTGTGGTGAAGTTCTACCGGGATGGAGCCGAAATCCAGAG 
CTCCCTTGATTTCCAAATTTCACAAGAAGGCGACCTCTACAGCTTACTGATTGCAGAAGC 
ATACCCTGAGGACTCAGGGACCTATTCAGTAAATGCCACCAATAGCGTTGGAAGAGCTAC 
TTCGACTGCTGAATTACTGGTTCAAGGTGAAGAAGAAGTACCTGCTAAAAAGACAAAGAC 
AATTGTTTCGACTGCTCAGATCTCAGAATCAAGACAAACCCGAATTGAAAAGAAGATTGA 
AGCCCACTTTGATGCCAGATCAATTGCAACAGTTGAGATGGTCATAGATGGTGCCGCTGG 
GCAACAGCTGCCACATAAAACACCTCCCAGGATTCCTCCGAAGCCAAAGTCAAGATCCCC 
AACACCACCGTCTATTGCTGCCAAAGCACAGCTGGCTCGGCAGCAGTCCCCATCGCCCAT 
AAGACACTCCCCTTCCCCGGTCAGACACGTGCGGGCACCGACCCCATCTCCGGTCAGGTC 
CGTGTCTCCAGCAGCAAGAATCTCCACATCCCCCATCAGGTCTGTTAGGTCTCCATTGCT 
CATGCGTAAGACTCAGGCATCCACCGTGGCCACAGGTCCTGAAGTGCCTCCCCCTTGGAA 
GCAAGAGGGCTACGTGGCCTCCTCATCTGAGGCTGAGATGAGAGAGACAACGCTGACAAC 
CTCTACTCAGATCAGGACAGAAGAGAGATGGGAAGGGAGATACGGTGTCCAGGAGCAAGT 
GACCATCAGTGGTGCTGCGGGTGCTGCCGCCAGTGTGTCGGCCAGTGCTAGCTACGCAGC 
AGAGGCTGTTGCCACTGGTGCTAAAGAGGTGAAACAAGATGCTGACAAAAGTGCAGCTGT 
TGCGACTGTTGTTGCTGCCGTTGATATGGCCAGAGTGAGAGAACCAGTGATCAGCGCTGT 
AGAGCAGACTGCTCAGAGGACAACCACGACTGCTGTGCACATCCAACCTGCTCAAGAACA 
GGTAAGAAAGGAAGCGGAGAAGACTGCTGTAACTAAGGTAGTAGTGGCCGCCGATAAAGC 
CAAGGAACAAGAATTAAAATCAAGAACCAAAGAAGTAATTACCACAAAGCAAGAGCAGAT 
GCACGTAACTCATGAGCAGATAAGAAAAGAAACTGAAAAAACATTTGTACCAAAGGTAGT 
AATTTCCGCAGCTAAAGCCAAAGAACAAGAAACTAGAATTTCTGAAGAAATTACTAAGAA 
ACAGAAACAAGTAACTCAAGAAGCAATAAGACAGGAAACTGAGATAACTGCTGCATCCAT 
GGTGGTAGTTGCCACTGCAAAGTCCACAAAACTAGAAACAGTCCCGGGAGCTCAAGAAGA 
AACTACCACACAACAAGATCAAATGCACCTAAGTTATGAAAAGATAATGAAGGAAACTAG 
GAAAACAGTTGTACCTAAAGTCATAGTTGCCACACCCAAAGTCAAAGAACAAGATTTAGT 

headerlist.txtファイルが正確に次のようになります。

ENST00000342992.10 
ENST00000460472.6 
ENST00000589042.5 
ENST00000591111.5 
ENST00000359218.9 
ENST00000615779.4 
ENST00000342175.10 

私は私がターゲットとしたいヘッダを集めawkのコードを書かれている、とそのヘッダーを次のヘッダーまで次の情報とともに収集します。


私はそれを呼び出す:それは生成する必要があり

#!/bin/awk      
NR == FNR {tags[$1]; next;} 
for (i in tags) { if (i ~ $0) {a=1; print; next;}} 
/>/ {a=0} 
a 

以下
awk -f myScript.txt <headerlist.txt> <informationfile.txt> 

はコードである

>Target Header 
Information attached to header 
. 
. 
. 

しかし、私は構文エラーを取得しています情報なし。矢印は空白スペースだけを指していません。

^ Syntax Error 

どうすればこの問題を修正できますか?

+1

中括弧の内側に 'for(i in tags) 'を移動します。 – karakfa

+0

実行中です。 :) - ただし、出力は生成されません。 –

+1

まあ、構文エラーを修正します。あなたのスクリプトには他にも問題があります。たとえば、 'a'の意図した使い方は何ですか? – karakfa

答えて

1

入力

$ cat HeaderList 
Target Header 
SomeOther Header 

$ cat InfoFile 
>Generic Header 
Information attached to header 
. 
. 
. 
>Target Header 
Information attached to header 
. 
. 
. 
>SomeOther Header 
Information attached to header 
. 
. 
. 

スクリプト

while read line 
    do 
    awk 'BEGIN{RS="\n>"}/'"$line"'/{printf ">%s\n",$0}' InfoFile 
    done <HeaderList 

出力

>Target Header 
Information attached to header 
. 
. 
. 
>SomeOther Header 
Information attached to header 
. 
. 
. 
+0

複数の "ターゲットヘッダー"を使用するにはどうすればよいですか? –

+1

@NicholasHayden:更新されました。 – sjsam

+0

ありがとう!出来た。どうもありがとうございます! –

1

私はこのワットを考えます

$ awk 'NR==FNR{h[$0]; next} 
     $0 in h{c=2} 
     c&&c--' headers file 

>Target Header 
Information attached to header 

ヘッダーが完全に同じ場合は、等価チェック($ 0 in h)と一致させて2行を出力できます。

新しいファイルのレイアウトでは、次のヘッダー

$ awk 'NR==FNR{h[$0]; next} 
      /^>/{p=0} 
     $0 in h{p=1} 
       p' headers file 

>Target Header 
Information attached to header 
. 
. 
. 

まで印刷したい場合は、このスクリプトは限り間の空白があるような

$ awk 'NR==FNR{h[">"$0]; next} 
      /^>/{p=0} 
     $1 in h{p=1} 
       p' headers file 

のように変更する必要がありますキー(ヘッダーファイルで使用されています)とそれ以外のレコードが残っています。今、ヘッダーは ">"プレフィックスを持たないでしょう。

+0

コードを実行しましたが、出力が生成されません。あなたは平等チェックで詳しく説明できますか? –

+1

あなたの入力ファイルを正確に使用し、 "> Target Header"を含むヘッダーファイルを使用しました。これは、ファイル内の行が 'h'(ヘッダファイルから読み込まれたもの)にあるかどうかをチェックします。テストするには質問からのテキストをファイルにコピーし、 'echo"> Target Header "> headers'でヘッダーファイルを作成し、' awk'スクリプトを実行します。 – karakfa

+0

あなたの出力を複製しようとしていますが、成功しません。私は提供された入力を使用し、あなたのコードに正確に挑戦しました。 –

関連する問題