Comparison of Bioinformatics Pipeline for Enrichment Illumina Next Generation Sequencing Systems in Detecting SARS-CoV-2 Virus Strains from Yogyakarta and Central Java, Indonesia

Main Author: Bernard, Stefanus
Format: Thesis application/pdf
Language: eng
Published: Indonesia International Institute for Life Sciences, 2022
Subjects: SARS-CoV-2; Next Generation Sequencing; Illumina; Bioinformatics Pipeline
Online Access: http://repository.i3l.ac.id/jspui/handle/123456789/251
ctrlnum 123456789-251
fullrecord <?xml version="1.0"?> <dc schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd"><title>Comparison of Bioinformatics Pipeline for Enrichment Illumina Next Generation Sequencing Systems in Detecting SARS-CoV-2 Virus Strains from Yogyakarta and Central Java, Indonesia</title><creator>Bernard, Stefanus</creator><subject>SARS-CoV-2</subject><subject>Next Generation Sequencing</subject><subject>Illumina</subject><subject>Bioinformatics Pipeline</subject><description>Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) is a newly emerged virus and the cause of the worldwide Coronavirus Disease 2019 (COVID-19) pandemic. Major advances in the Next Generation Sequencing (NGS) field followed the first release of a full-length SARS-CoV-2 genome on January 10th, 2020, with the hope of turning the tide against the worsening pandemic. Previous studies characterizing respiratory viruses required mapping raw sequences to the human genome in the downstream bioinformatics pipeline, in line with metagenomic principles. Illumina, the major player in the NGS arena, responded by releasing guidelines for an improved enrichment kit, the Respiratory Virus Oligo Panel (RVOP), which is based on a hybridization capture method and captures targeted respiratory viruses including SARS-CoV-2. This allows raw sequence data to be mapped directly to the SARS-CoV-2 genome in the downstream bioinformatics pipeline. Consequently, two bioinformatics pipelines emerged, and no previous study has benchmarked them. This study aims to gain insight into the target enrichment workflow by Illumina by applying two bioinformatics pipelines, named &#x2018;Fast Pipeline&#x2019; and &#x2018;Normal Pipeline&#x2019;, to SARS-CoV-2 strains isolated from Yogyakarta and Central Java. 
Overall, both pipelines perform well in characterizing SARS-CoV-2 samples, including identifying the major studied nucleotide substitutions and amino acid mutations. Certain limitations were identified in the pipeline algorithms; future studies are strongly recommended to design the pipeline within an integrated framework, for instance by using Nextflow, a workflow framework, to combine all scripts into one fully integrated pipeline.</description><date>2022-03-08T08:07:45Z</date><date>2022-03-08T08:07:45Z</date><date>2021-08-21</date><type>Thesis:Thesis</type><identifier>http://repository.i3l.ac.id/jspui/handle/123456789/251</identifier><language>eng</language><relation>BI 21-001;T202109032</relation><type>File:application/pdf</type><publisher>Indonesia International Institute for Life Sciences</publisher><recordID>123456789-251</recordID></dc>
language eng
format Thesis:Thesis
Thesis
File:application/pdf
File
author Bernard, Stefanus
title Comparison of Bioinformatics Pipeline for Enrichment Illumina Next Generation Sequencing Systems in Detecting SARS-CoV-2 Virus Strains from Yogyakarta and Central Java, Indonesia
publisher Indonesia International Institute for Life Sciences
publishDate 2022
topic SARS-CoV-2
Next Generation Sequencing
Illumina
Bioinformatics Pipeline
url http://repository.i3l.ac.id/jspui/handle/123456789/251
contents Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) is a newly emerged virus and the cause of the worldwide Coronavirus Disease 2019 (COVID-19) pandemic. Major advances in the Next Generation Sequencing (NGS) field followed the first release of a full-length SARS-CoV-2 genome on January 10th, 2020, with the hope of turning the tide against the worsening pandemic. Previous studies characterizing respiratory viruses required mapping raw sequences to the human genome in the downstream bioinformatics pipeline, in line with metagenomic principles. Illumina, the major player in the NGS arena, responded by releasing guidelines for an improved enrichment kit, the Respiratory Virus Oligo Panel (RVOP), which is based on a hybridization capture method and captures targeted respiratory viruses including SARS-CoV-2. This allows raw sequence data to be mapped directly to the SARS-CoV-2 genome in the downstream bioinformatics pipeline. Consequently, two bioinformatics pipelines emerged, and no previous study has benchmarked them. This study aims to gain insight into the target enrichment workflow by Illumina by applying two bioinformatics pipelines, named ‘Fast Pipeline’ and ‘Normal Pipeline’, to SARS-CoV-2 strains isolated from Yogyakarta and Central Java. Overall, both pipelines perform well in characterizing SARS-CoV-2 samples, including identifying the major studied nucleotide substitutions and amino acid mutations. Certain limitations were identified in the pipeline algorithms; future studies are strongly recommended to design the pipeline within an integrated framework, for instance by using Nextflow, a workflow framework, to combine all scripts into one fully integrated pipeline.
id IOS17824.123456789-251
institution Indonesia International Institute for Life Sciences
institution_id 1894
institution_type library:university
library
library Perpustakaan Indonesia International Institute for Life Sciences (i3L)
library_id 1547
collection Scientific Repository I3L
repository_id 17824
subject_area Bio medicine
Food Sciences
Food Technology/Teknologi Pembuatan Makanan Komersial
Practical Pharmacy
city JAKARTA TIMUR
province DKI JAKARTA
shared_to_ipusnas_str 1
repoId IOS17824
first_indexed 2022-11-22T09:06:25Z
last_indexed 2022-11-22T09:06:25Z
recordtype dc
merged_child_boolean 1