Usage Statistics

No. Analysis pipeline name Usage count Description Registered by
1 RNA_DEG_PIPELINE_USING_VOOM_VER_1 482 Quality checking, Adaptive trimming, Transcripts indexing, BAM aligning, BAM file replacing, BAM sorting, BAM indexing, Statistics calculating, Transcripts indexing, Transcripts building, Chromosome GTF file sorting, Bed file converting, BAM sorting, Transcriptome coordinates translating, BAM filtering, Transcriptome aligning and calculating, Making matrix for limma voom, Mean-variance estimating, Precision weight generating hgh87
2 VARIANT_CALLING_PIPELINE_USING_SNPEFF_VER_1 414 Identifying genomic variants, such as single nucleotide polymorphisms and DNA insertions and deletions, can play an important role in scientific discovery. To this end, a pipeline has been developed to allow researchers at the CGSB to rapidly identify and annotate variants. The pipeline employs the Genome Analysis Toolkit to perform variant calling and is based on the best practices for variant discovery analysis outlined by the Broad Institute. Once SNPs have been identified, SnpEff is utilized to annotate and predict the effects of the variants hgh87
3 RNA_DEG_PIPELINE_USING_CUFFPACKAGE_VER_1 380 Quality checking, Adaptive trimming, BAM aligning, Transcripts indexing, Redundancy removing, BAM indexing, Transcripts comparing, Differential expression hgh87
4 LAST_PIPELINE_VER_1 141 LastDB: This program prepares sequences for subsequent comparison and alignment using lastal. LastAL: This program finds local alignments between query sequences, and reference sequences that have been prepared using lastdb. LastSplit: This program estimates "split alignments" (typically for DNA) or "spliced alignments" (typically for RNA). If you run the split module on a whole-genome MAF file, you have to run the maf-swap module after the split module, and then run the split module again for alignment accuracy. LastSwap: This will align each rat base-pair to at most one cat base-pair, but not necessarily vice versa. We can get 1-to-1 alignments by swapping the sequences and running last-split again. If you run the maf-swap module on a whole-genome MAF file, you have to run the split module after the maf-swap module, and then run the maf-swap module again. LastConvert: This script reads alignments in MAF format and writes them in another format. It can write them in these formats: axt, blast, blasttab, html, psl, sam, tab. (If it is protein data, you have to choose psl.) hgh87
5 CLUSTALO_PIPELINE_VER_1 38 Clustal Omega is a general-purpose multiple sequence alignment program hgh87
6 INTERPROSCAN_PIPELINE_VER_1 34 Users who have novel nucleotide or protein sequences that they wish to functionally characterise can use the software package InterProScan to run the scanning algorithms from the InterPro database in an integrated way. Sequences are submitted in FASTA format. Matches are then calculated against all of the required member database's signatures and the results are then output in a variety of formats hgh87
7 RNA_DEG_PIPELINE_USING_CUFFPACKAGE_VER_2 31 Quality checking, Adaptive trimming, BAM aligning, Transcripts indexing, Redundancy removing, BAM indexing, Transcripts comparing, Differential expression hgh87
8 HADOOP_BIG_BWA_MEM_PIPELINE_VER_1 29 Uses Hadoop to boost the performance of the Burrows-Wheeler Aligner (BWA). BWA works by seeding alignments with maximal exact matches (MEMs) and then extending seeds with the affine-gap Smith-Waterman algorithm (SW) hgh87
9 HADOOP_SPARK_BWA_MEM_PIPELINE_VER_1 26 SparkBWA is a tool that integrates the Burrows-Wheeler Aligner (BWA) on an Apache Spark framework running on top of Hadoop hgh87
10 EPIGENOME_ANALYSIS_PIPELINE_USING_HOMER_VER_2 21 Quality check of a raw file using FastQC, Filtering reads with low quality scores using fastq_quality_filter, Quality check of the filtered files using FastQC, Alignment using Bowtie, Peak calling using MACS, Annotation using annotatePeaks, Visualization using makeUCSCfile hgh87
11 METAGENOME_ANALYSIS_PIPELINE_USING_QIIME_VER_1 21 Pre-processing, OTU clustering, taxonomic assignment, sequence alignment, report taxonomic assignment result, alpha diversity, beta diversity hgh87
12 EPIGENOME_ANALYSIS_PIPELINE_USING_HOMER_VER_1 19 Quality check of a raw file using FastQC, Filtering reads with low quality scores using fastq_quality_filter, Quality check of the filtered files using FastQC, Alignment using Bowtie, Peak calling using MACS, Annotation using annotatePeaks, Visualization using makeUCSCfile hgh87
13 RNA_ANALYSIS_PIPELINE_USING_EMSAR_VER_1 16 Indexing, Mapping (Non-EMSAR), Quantify, Sums the isoform-level estimates to obtain gene-level expression level estimates, Converts all the isoform-level fpkm files in a specified directory to gene-level fpkm (gfpkm) files hgh87
14 RNA_DEG_PIPELINE_USING_VOOM_VER_2 16 Quality checking, Adaptive trimming, Transcripts indexing, BAM aligning, BAM file replacing, BAM sorting, BAM indexing, Statistics calculating, Transcripts indexing, Transcripts building, Chromosome GTF file sorting, Bed file converting, BAM sorting, Transcriptome coordinates translating, BAM filtering, Transcriptome aligning and calculating, Making matrix for limma voom, Mean-variance estimating, Precision weight generating hgh87
15 GSA_PIPELINE_VER_1 15 At Ambry, Sanger gene sequencing is performed on specific regions of DNA that have been amplified by polymerase chain reaction (PCR). Double stranded sequencing occurs in both sense and antisense directions to detect sequence variations. For Specific Site Analysis, specific region(s) of DNA is/are amplified by PCR and sequenced. Sanger sequencing is performed for any regions missing or with insufficient read depth coverage for reliable heterozygous variant detection. Suspect variant calls other than "likely benign" or "benign" are verified by Sanger sequencing hgh87
16 HADOOP_BAM_PIPELINE_VER_1 12 Hadoop-BAM is a Java library for the manipulation of files in common bioinformatics formats using the Hadoop MapReduce framework with the Picard SAM JDK, and command line tools similar to SAMtools. Cat concatenates partial SAM and BAM files. The Fixmate algorithm fixes BAM and SAM mate information. The Index algorithm indexes a BAM file. The Sort algorithm sorts and merges BAM or SAM files hgh87
17 MUSCLE_PIPELINE_VER_1 12 MUSCLE is one of the best-performing multiple alignment programs according to published benchmark tests hgh87
18 HADOOP_BLASTP_PIPELINE_VER_1 12 An algorithm, based on Hadoop, for comparing primary biological sequence information such as the amino acid sequences of different proteins or the nucleotides of DNA sequences hgh87
19 NGSGD_PIPELINE_VER_1 12 Gender determination tool for NGS data hgh87
20 EXOME_ANALYSIS_PIPELINE_USING_GATK_VER_1 10 Initial QC check, Trim adapter sequence, Remove poor quality reads, Initial QC check, Identify paired/unpaired reads, Alignment, Merge SAM files, Add read groups, Sorting, Removing duplicates, Fixmate, Table recalibration, Local realignment, Summary hgh87
21 EXOME_ANALYSIS_PIPELINE_USING_GATK_VER_2 10 Initial QC check, Trim adapter sequence, Remove poor quality reads, Initial QC check, Identify paired/unpaired reads, Alignment, Merge SAM files, Add read groups, Sorting, Removing duplicates, Fixmate, Table recalibration, Local realignment, Summary hgh87
22 VARIANT_CALLING_PIPELINE_USING_SNPEFF_VER_2 8 Identifying genomic variants, such as single nucleotide polymorphisms and DNA insertions and deletions, can play an important role in scientific discovery. To this end, a pipeline has been developed to allow researchers at the CGSB to rapidly identify and annotate variants. The pipeline employs the Genome Analysis Toolkit to perform variant calling and is based on the best practices for variant discovery analysis outlined by the Broad Institute. Once SNPs have been identified, SnpEff is utilized to annotate and predict the effects of the variants hgh87
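The voom-based pipelines in the table above list matrix building and mean-variance estimation among their steps. As a rough illustration only, the log2 counts-per-million transform that voom applies to a count matrix before fitting the mean-variance trend might be sketched as follows. The function name `log_cpm` and the toy counts are hypothetical, library sizes are assumed to be plain column sums with no normalization factors, and the registered pipelines presumably call the limma R package rather than anything like this Python code.

```python
import math


def log_cpm(counts):
    """Return log2 counts-per-million for a genes x samples count matrix.

    Uses the offsets of the voom transform: 0.5 is added to each count
    and 1 is added to each library size before taking log2.
    """
    n_samples = len(counts[0])
    # Library size = total count per sample (column sum); assumes no
    # extra normalization factors are applied.
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    return [
        [
            math.log2((row[j] + 0.5) / (lib_sizes[j] + 1.0) * 1e6)
            for j in range(n_samples)
        ]
        for row in counts
    ]


# Hypothetical toy matrix: 3 genes (rows) x 2 samples (columns).
toy_counts = [[10, 20], [0, 5], [90, 75]]
toy_log_cpm = log_cpm(toy_counts)
```

The 0.5 offset keeps zero counts finite after the logarithm, which is why the second gene in the toy matrix still receives a value in the first sample.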
No. Analysis program Usage count Description Type Registered by
1 RNA Sequencing Software Tools 85 Parameterize samtools properly (sorts the aligned BAM file by chromosome to match the reference order) LINUX hgh87
2 Metagenomic Sequencing Software Tools 22 The OTU picking step assigns similar sequences to operational taxonomic units, or OTUs, by clustering sequences based on a user-defined similarity threshold LINUX hgh87
3 Metagenomic Sequencing Software Tools 22 Split libraries according to barcodes specified in mapping file LINUX hgh87
4 Whole-genome Sequencing Software Tools 25 Generates a BAM index ".bai" file. This tool creates an index file for the input BAM that allows fast look-up of data in a BAM file, like an index on a database LINUX hgh87
5 RNA Sequencing Software Tools 138 bowtie2-build builds a Bowtie index from a set of DNA sequences. bowtie2-build outputs a set of 6 files with suffixes .1.bt2, .2.bt2, .3.bt2, .4.bt2, .rev.1.bt2, and .rev.2.bt2. (If the total length of all the input sequences is greater than about 4 billion, then the index files will end in bt2l instead of bt2.) (generates index files (.bt2) for efficient computation from the transcriptome.fa file) LINUX hgh87
6 Whole-genome Sequencing Software Tools 1 Sorts a BAM file. This tool sorts the input BAM file by coordinate, queryname (QNAME), or some other property of the SAM record LINUX hgh87
7 GWAS Software Tools 11 Gender determination tool for NGS data LINUX hgh87
8 RNA Sequencing Software Tools 87 Cufflinks is both the name of a suite of tools and a program within that suite. Cufflinks the program assembles transcriptomes from RNA-Seq data and quantifies their expression (calculates the abundances of a set of transcripts) LINUX hgh87
9 RNA Sequencing Software Tools 19 Calculates the TPM value for a sample using 'gfpkm', 'gene read count', and 'gene' LINUX hgh87
10 Whole-genome Sequencing Software Tools 23 A simple Perl program that allows the user to compare QC-filtered FASTQ files (checks paired and unpaired reads and records the paired-end reads for each filter) LINUX hgh87
11 Metagenomic Sequencing Software Tools 33 Users who have novel nucleotide or protein sequences that they wish to functionally characterise can use the software package InterProScan to run the scanning algorithms from the InterPro database in an integrated way. Sequences are submitted in FASTA format. Matches are then calculated against all of the required member database's signatures and the results are then output in a variety of formats (takes sequence data in FASTA format as input, runs the scanning algorithms, and builds a table of matches computed against all signatures in the database) LINUX hgh87
12 RNA Sequencing Software Tools 19 Bowtie is an ultrafast, memory-efficient short read aligner. It aligns short DNA sequences (reads) to the human genome at a rate of over 25 million 35-bp reads per hour (generates a SAM file) LINUX hgh87
13 RNA Sequencing Software Tools 85 Aligns input reads against a reference transcriptome with Bowtie and calculates expression values using the alignments (quantification step that counts reads against the reference using the filtered BAM file) LINUX hgh87
14 Etc Tools 1 Splits a FASTA file LINUX hgh87
15 Comparative Genomic Software Tools 31 Clustal Omega is a general-purpose multiple sequence alignment program LINUX hgh87
16 ChIP Sequencing Software Tools 48 All-in-one program for performing peak annotation (identifies the genomic features of the peaks produced by peak calling) LINUX hgh87
17 Whole-genome Sequencing Software Tools 1 Trimmomatic is a fast, multithreaded command line tool that can be used to trim and crop Illumina (FASTQ) data as well as to remove adapters. The paired end mode will maintain correspondence of read pairs and also use the additional information contained in paired reads to better find adapter or PCR primer fragments introduced by the library preparation process (removes adapters from paired-end reads and trims Illumina FASTQ data) LINUX hgh87
18 Whole-genome Sequencing Software Tools 12 Hadoop-BAM is a Java library for the manipulation of files in common bioinformatics formats using the Hadoop MapReduce framework with the Picard SAM JDK, and command line tools similar to SAMtools. The Index algorithm indexes a BAM file HADOOP hgh87
19 Whole-genome Sequencing Software Tools 25 This tool generates plots for visualizing the quality of a recalibration run LINUX hgh87
20 Metagenomic Sequencing Software Tools 22 Tabulates the number of times an OTU is found in each sample, and adds the taxonomic predictions for each OTU in the last column if a taxonomy file is supplied (builds the OTU table) LINUX hgh87
21 RNA Sequencing Software Tools 19 Generates an index file (.rsh file) for efficient computation from the Transcriptome.fa file LINUX hgh87
22 Whole-genome Sequencing Software Tools 23 Detect systematic errors in base quality scores LINUX hgh87
23 Whole-genome Sequencing Software Tools 23 Merges multiple (fewer than 6) SAM and/or BAM files into a single file. This tool is used for combining SAM and/or BAM files from different runs or read groups, similarly to the "merge" function of Samtools (merges 6 or fewer SAM or BAM files) LINUX hgh87
24 Whole-genome Sequencing Software Tools 48 Index reference sequence in the FASTA format or extract subsequence from indexed reference sequence. If no region is specified, faidx will index the file and create .fai on the disk. If regions are specified, the subsequences will be retrieved and printed to stdout in the FASTA format (extracts subsequences and creates the .fai index) LINUX hgh87
25 Whole-genome Sequencing Software Tools 71 Filters sequences based on quality (filters out low-quality sequences) LINUX hgh87
26 RNA Sequencing Software Tools 87 Comparing expression levels of genes and transcripts in RNA-Seq experiments is a hard problem. Cuffdiff is a highly accurate tool for performing these comparisons, and can tell you not only which genes are up- or down-regulated between two or more conditions, but also which genes are differentially spliced or are undergoing other types of isoform-level regulation (performs differential expression analysis) LINUX hgh87
27 RNA Sequencing Software Tools 84 Transcriptome assembly and differential expression analysis for RNA-Seq LINUX hgh87
28 Whole-genome Sequencing Software Tools 24 Produces a summary of alignment metrics from a SAM or BAM file LINUX hgh87
29 Whole-genome Sequencing Software Tools 173 Sickle is a tool that uses sliding windows along with quality and length thresholds to determine when quality is sufficiently low to trim the 3'-end of reads and also determines when the quality is sufficiently high to trim the 5'-end of reads (removes low-quality reads and adapters, then matches R1 and R2 pairs to keep only the sequences common to both) LINUX hgh87
30 Whole-genome Sequencing Software Tools 48 Write out sequence read data (for filtering, merging, subsetting, etc.) LINUX hgh87
31 Bisulfite Sequencing Software tools 34 LAST_lastdb, LAST_lastal, LAST_split, LAST_maf-swap, LAST_maf-convert LINUX hgh87
32 ChIP Sequencing Software Tools 48 Model-based Analysis of ChIP-Seq data, MACS, which analyzes data generated by short read sequencers such as Solexa's Genome Analyzer (finds histone-enriched regions) LINUX hgh87
33 Whole-genome Sequencing Software Tools 25 A VCF containing many samples and/or variants will need to be subset in order to facilitate certain analyses LINUX hgh87
34 Whole-genome Sequencing Software Tools 23 Find the SA coordinates of the input reads. Maximum maxSeedDiff differences are allowed in the first seedLen subsequence and maximum maxDiff differences are allowed in the whole sequence LINUX hgh87
35 Whole-genome Sequencing Software Tools 23 Generate alignments in the SAM format given single-end reads. Repetitive hits will be randomly chosen LINUX hgh87
36 Metagenomic Sequencing Software Tools 23 Aligns the sequences in a FASTA file to each other or to a template sequence alignment, depending on the method chosen LINUX hgh87
37 Etc Tools 86 Sorts the GTF file by chromosome and converts the BED file to a GTF file (sorts the GTF file by chromosome, then converts the GTF file to a BED file) LINUX hgh87
38 Whole-genome Sequencing Software Tools 23 Assess sequence coverage by a wide array of metrics, partitioned by sample, read group, or library LINUX hgh87
39 Metagenomic Sequencing Software Tools 22 Produces a tree from a multiple sequence alignment LINUX hgh87
40 RNA Sequencing Software Tools 19 Creates a stats file describing the GC content, isoform information, and other properties of each transcript LINUX hgh87
41 Whole-genome Sequencing Software Tools 25 SnpEff is a variant annotation and effect prediction tool. It annotates and predicts the effects of variants on genes (such as amino acid changes) LINUX hgh87
42 Whole-genome Sequencing Software Tools 48 Define intervals to target for local realignment LINUX hgh87
43 Metagenomic Sequencing Software Tools 22 Performs beta diversity and principal coordinate analysis, and generates a preferences file along with 3D PCoA plots LINUX hgh87
44 Whole-genome Sequencing Software Tools 27 SparkBWA MEM is a tool that integrates the Burrows-Wheeler Aligner (BWA) on an Apache Spark framework running on top of Hadoop HADOOP hgh87
45 Whole-genome Sequencing Software Tools 48 Creates a sequence dictionary for a reference sequence. This tool creates a sequence dictionary file (with ".dict" extension) from a reference sequence provided in FASTA format, which is required by many processing and analysis tools. The output file contains a header but no SAMRecords, and the header contains only sequence records LINUX hgh87
46 Whole-genome Sequencing Software Tools 42 Measures the genome-wide coverage of a feature file (measures the genome coverage of a BAM file) LINUX hgh87
47 Etc Tools 19 Creates a g2t file from each gene ID and its matching transcript IDs LINUX hgh87
48 Whole-genome Sequencing Software Tools 108 Does a full pass through the input file to calculate and print statistics to stdout LINUX hgh87
49 Whole-genome Sequencing Software Tools 25 Sorts a SAM file. This tool sorts the input SAM file by coordinate, queryname (QNAME), or some other property of the SAM record LINUX hgh87
50 Whole-genome Sequencing Software Tools 133 Replace read groups in a BAM file. This tool enables the user to replace all read groups in the INPUT file with a single new read group and assign all reads to this read group in the OUTPUT BAM file LINUX hgh87
51 Whole-genome Sequencing Software Tools 23 Removes some base-sequence artifacts LINUX hgh87
52 RNA Sequencing Software Tools 85 Translate from genome to transcriptome coordinates (annotates the chromosome-sorted BAM file with the transcriptome) LINUX hgh87
53 Whole-genome Sequencing Software Tools 12 Hadoop-BAM is a Java library for the manipulation of files in common bioinformatics formats using the Hadoop MapReduce framework with the Picard SAM JDK, and command line tools similar to SAMtools. The Fixmate algorithm fixes BAM and SAM mate information HADOOP hgh87
54 RNA Sequencing Software Tools 85 MapSplice is software for mapping RNA-seq data to a reference genome for splice junction discovery that depends only on the reference genome, and not on any further annotations (maps each set of reads (R1, R2) to the reference genome with MapSplice2) LINUX hgh87
55 RNA Sequencing Software Tools 85 Building references from a set of transcripts (prepares the reference sequences for RSEM) LINUX hgh87
56 Whole-genome Sequencing Software Tools 25 The Genome Analysis Toolkit or GATK is a software package for analysis of high-throughput sequencing data, developed by the Data Science and Data Engineering group at the Broad Institute (transforms and analyzes sequencing data using two databases) LINUX hgh87
57 Whole-genome Sequencing Software Tools 25 Whenever the program encounters a region showing signs of variation, it discards the existing mapping information and completely reassembles the reads in that region LINUX hgh87
58 Metagenomic Sequencing Software Tools 22 After picking OTUs, you can then pick a representative set of sequences LINUX hgh87
59 Etc Tools 85 Makes the matrix for limma voom (enter 10 or fewer isoforms.result files) LINUX hgh87
60 Whole-genome Sequencing Software Tools 195 Index a coordinate-sorted BAM or CRAM file for fast random access LINUX hgh87
61 RNA Sequencing Software Tools 116 bowtie-build builds a Bowtie index from a set of DNA sequences. bowtie-build outputs a set of 6 files with suffixes .1.ebwt, .2.ebwt, .3.ebwt, .4.ebwt, .rev.1.ebwt, and .rev.2.ebwt. (If the total length of all the input sequences is greater than about 4 billion, then the index files will end in ebwtl instead of ebwt.) (generates index files (.ebwt) for efficient computation from the transcriptome.fa file) LINUX hgh87
62 Whole-genome Sequencing Software Tools 28 Index database sequences in the FASTA format LINUX hgh87
63 Whole-genome Sequencing Software Tools 87 Remove potential PCR duplicates: if multiple read pairs have identical external coordinates, only retain the pair with highest mapping quality. In the paired-end mode, this command ONLY works with FR orientation and requires that ISIZE is correctly set. It does not work for unpaired reads LINUX hgh87
64 Metagenomic Sequencing Software Tools 22 Assign taxonomy to each sequence LINUX hgh87
65 Metagenomic Sequencing Software Tools 22 Filter sequence alignment by removing highly variable regions LINUX hgh87
66 Whole-genome Sequencing Software Tools 86 Sort alignments by leftmost coordinates, or by read name when -n is used. An appropriate @HD-SO sort order header tag will be added or an existing one updated if necessary LINUX hgh87
67 RNA Sequencing Software Tools 85 Filter reads from a paired end SAM or BAM file (only outputs paired reads) (removes reads with large indels or inserts and poorly mapped reads) LINUX hgh87
68 Etc Tools 23 Group generic methods can be defined for four pre-specified groups of functions, Math, Ops, Summary and Complex. (There are no objects of these names in base R, but there are in the methods package.) (summarizes the results in R into a single table) LINUX hgh87
69 Whole-genome Sequencing Software Tools 23 Verify mate-pair information between mates and fix if needed. This tool ensures that all mate-pair information is in sync between each read and its mate pair LINUX hgh87
70 Comparative Genomic Software Tools 12 MUSCLE is one of the best-performing multiple alignment programs according to published benchmark tests (performs fast, accurate multiple sequence alignment) LINUX hgh87
71 Whole-genome Sequencing Software Tools 19 With no options or regions specified, prints all alignments in the specified input alignment file (in SAM, BAM, or CRAM format) to standard output in SAM format (with no header) (converts to a BAM file, a binary format) LINUX hgh87
72 RNA Sequencing Software Tools 88 TopHat is a fast splice junction mapper for RNA-Seq reads. It aligns RNA-Seq reads to mammalian-sized genomes using the ultra high-throughput short read aligner Bowtie, and then analyzes the mapping results to identify splice junctions between exons (maps reads to the reference with TopHat) LINUX hgh87
73 ChIP Sequencing Software Tools 48 The UCSC Genome Browser is quite possibly one of the best computational tools ever developed. Not only does it contain an incredible amount of data in a single application, it allows users to upload custom information such as data from their ChIP-Seq experiments so that they can be easily visualized and compared to other information (creates a bedGraph format file) LINUX hgh87
74 Metagenomic Sequencing Software Tools 23 Checks the user's metadata mapping file for required data and valid format LINUX hgh87
75 Whole-genome Sequencing Software Tools 23 Cutadapt finds and removes adapter sequences, primers, poly-A tails and other types of unwanted sequence from your high-throughput sequencing reads LINUX hgh87
76 Whole-genome Sequencing Software Tools 48 Perform local realignment of reads around indels LINUX hgh87
77 RNA Sequencing Software Tools 49 Bowtie is an ultrafast, memory-efficient short read aligner. It aligns short DNA sequences (reads) to the human genome at a rate of over 25 million 35-bp reads per hour (generates a SAM file) LINUX hgh87
78 Whole-genome Sequencing Software Tools 25 This tool is designed for hard-filtering variant calls based on certain criteria (by using JEXL queries) LINUX hgh87
79 Whole-genome Sequencing Software Tools 12 Hadoop-BAM is a Java library for the manipulation of files in common bioinformatics formats using the Hadoop MapReduce framework with the Picard SAM JDK, and command line tools similar to SAMtools. Cat concatenates partial SAM and BAM files HADOOP hgh87
80 Metagenomic Sequencing Software Tools 22 Tabulates the number of times an OTU is found in each sample, and adds the taxonomic predictions for each OTU in the last column if a taxonomy file is supplied (builds the OTU table) LINUX hgh87
81 Whole-genome Sequencing Software Tools 48 This tool provides useful metrics for validating library construction including the insert size distribution and read orientation of paired-end libraries LINUX hgh87
82 ChIP Sequencing Software Tools 48 To facilitate the analysis of ChIP-Seq (or any other type of short read re-sequencing data), it is useful to first transform the sequence alignment into a platform-independent data structure representing the experiment, analogous to loading the data into a database (shows the ChIP-fragment density at every position in the genome) LINUX hgh87
83 Whole-genome Sequencing Software Tools 158 FastQC aims to provide a simple way to do some quality control checks on raw sequence data coming from high throughput sequencing pipelines. It provides a modular set of analyses which you can use to give a quick impression of whether your data has any problems of which you should be aware before doing any further analysis (checks the sequencing quality of the raw FASTQ data) LINUX hgh87
84 Metagenomic Sequencing Software Tools 22 Generate rarefied OTU tables, compute alpha diversity metrics for each rarefied OTU table, collate alpha diversity results, and generate alpha rarefaction plots LINUX hgh87
85 Whole-genome Sequencing Software Tools 24 Align 70bp-1Mbp query sequences with the BWA-MEM algorithm. Briefly, the algorithm works by seeding alignments with maximal exact matches (MEMs) and then extending seeds with the affine-gap Smith-Waterman algorithm (SW) LINUX hgh87
86 Whole-genome Sequencing Software Tools 1 Index a coordinate-sorted BAM or CRAM file for fast random access (indexing step that speeds up random access to a SAM file) LINUX hgh87
87 GSA Software Tools 13 At Ambry, Sanger gene sequencing is performed on specific regions of DNA that have been amplified by polymerase chain reaction (PCR). Double stranded sequencing occurs in both sense and antisense directions to detect sequence variations. For Specific Site Analysis, specific region(s) of DNA is/are amplified by PCR and sequenced. Sanger sequencing is performed for any regions missing or with insufficient read depth coverage for reliable heterozygous variant detection. Suspect variant calls other than "likely benign" or "benign" are verified by Sanger sequencing (compares expression levels between the experimental and control groups in RNA-Sequencing data to reveal the biological functions that differ) LINUX hgh87
88 Whole-genome Sequencing Software Tools 29 Uses Hadoop to boost the performance of the Burrows-Wheeler Aligner (BWA), which works by seeding alignments with maximal exact matches (MEMs) and then extending seeds with the affine-gap Smith-Waterman algorithm (SW) (aligns on Hadoop by extending seeds with BWA's affine-gap Smith-Waterman algorithm) HADOOP hgh87
89 Whole-genome Sequencing Software Tools 23 Generate alignments in the SAM format given paired-end reads. Repetitive read pairs will be placed randomly LINUX hgh87
90 Whole-genome Sequencing Software Tools 48 Identifies duplicate reads. This tool locates and tags duplicate reads (both PCR and optical/sequencing-driven) in a BAM or SAM file, where duplicate reads are defined as originating from the same original fragment of DNA LINUX hgh87
91 Whole-genome Sequencing Software Tools 12 Hadoop-BAM is a Java library for the manipulation of files in common bioinformatics formats using the Hadoop MapReduce framework with the Picard SAM JDK, and command line tools similar to SAMtools. The Sort algorithm sorts and merges BAM or SAM files HADOOP hgh87
92 Metagenomic Sequencing Software Tools 22 Summarize OTU by Category (optional, pass -c); Summarize Taxonomy; and Plot Taxonomy Summary LINUX hgh87
93 RNA Sequencing Software Tools 85 The voom method estimates the mean-variance relationship of the log-counts, generates a precision weight for each observation and enters these into the limma empirical Bayes analysis pipeline (creates a design matrix describing the mutation vs. wild-type contrast, then uses it to perform quantile normalization through the voom transformation) LINUX hgh87
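Program 9 in the table above calculates TPM values from gene-level FPKM ('gfpkm') and related inputs. The registered program itself is not shown on this page, so the following is only a minimal sketch of the standard FPKM-to-TPM rescaling, under the assumption that gene-level FPKM values alone are available. The function name `fpkm_to_tpm` and the gene names are hypothetical.

```python
def fpkm_to_tpm(gene_fpkm):
    """Rescale gene-level FPKM values so that they sum to one million.

    gene_fpkm maps a gene identifier to its FPKM value for one sample.
    """
    total = sum(gene_fpkm.values())
    if total == 0:
        # No expression at all: avoid dividing by zero.
        return {gene: 0.0 for gene in gene_fpkm}
    return {gene: value / total * 1e6 for gene, value in gene_fpkm.items()}


# Hypothetical gene-level FPKM values for a single sample.
toy_fpkm = {"geneA": 10.0, "geneB": 30.0, "geneC": 60.0}
toy_tpm = fpkm_to_tpm(toy_fpkm)
```

Because TPM values always sum to one million within a sample, they are easier to compare across samples than raw FPKM values.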