Overall solution of the Huisuan Bioinformatics Platform
Huisuan Bio provides professional, integrated software and hardware solutions for bioinformatics platforms to life science research institutions, gene sequencing companies, and similar users. The aim is to give customers worry-free backend support so that researchers and companies can focus on their core business.
The overall solution covers the following:
- customized software and hardware system integration, including servers, storage, networks, file systems, operating systems, cluster management software, and other IT infrastructure;
- construction of diverse bioinformatics data analysis pipelines, such as whole-genome, whole-exome, and transcriptome analysis;
- full system maintenance, from software to hardware;
- the "Huisuan Bioinformatics Cloud" cloud service;
- outsourced services for complex, highly customized data analysis;
- centralized or customized training and examination-based certification through the "Huisuan Bioinformatics Institute";
- related knowledge bases, databases, and similar resources.
High-performance computing and massive storage systems
Application scenario 1: High-performance computing and storage systems for sequencers
Huisuan provides massively parallel file storage and appropriately scaled high-performance computing systems for large Illumina sequencers. Currently, the highest-throughput sequencing system (dual flow cell) can produce about 18 T of base data in three days.
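To give a feel for the storage scale this implies, here is a rough back-of-the-envelope sketch. It treats "18 T" as roughly 18 TB of stored data per run, which is an assumption (the text does not state the unit), and the utilisation factor is a hypothetical value, not a vendor figure.

```python
# Rough yearly raw-output estimate for one sequencer.
# All inputs are assumptions for illustration, not vendor specifications.
TB_PER_RUN = 18.0    # "about 18 T" per run, read here as 18 TB (assumption)
DAYS_PER_RUN = 3.0   # run duration quoted in the text
UTILISATION = 0.8    # hypothetical fraction of the year the sequencer is busy


def yearly_raw_output_tb(tb_per_run=TB_PER_RUN,
                         days_per_run=DAYS_PER_RUN,
                         utilisation=UTILISATION):
    """Return the estimated raw data volume per year, in TB."""
    runs_per_year = 365.0 / days_per_run * utilisation
    return runs_per_year * tb_per_run


if __name__ == "__main__":
    total_tb = yearly_raw_output_tb()
    print(f"{total_tb:.0f} TB per year (about {total_tb / 1000:.2f} PB)")
```

Under these assumptions a single sequencer already produces on the order of petabytes per year before any intermediate or result files are counted.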
Gene sequencing and analysis generally comprise three major stages:
Stage 1: The Illumina sequencer collects data and processes it to generate the raw files in FASTQ format;
Stage 2: Software such as TopHat/BWA/Bowtie reads the FASTQ files and the human reference genome index, and generates BAM files through sequence alignment;
Stage 3: Software such as GATK/samtools, or other analysis software such as the structural variant detection tools Manta/Varsand and the copy number variant detection tool CNVnator, analyzes and processes the BAM files and finally generates VCF files.
This scenario requires parallel file system storage that can scale out to the PB level, as well as dozens of high-performance computing nodes. The system must hold the large volume of FASTQ files produced by the sequencers and also supply the computing and storage resources needed for bioinformatics data analysis.
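Stages 2 and 3 of this scenario can be sketched as a small command builder. This is a minimal illustration, not Huisuan's actual pipeline: it picks BWA-MEM, samtools, and GATK HaplotypeCaller from the tools named above, the function name and file-naming scheme are hypothetical, and it only builds the command strings rather than running them. A real run would also need the reference to be indexed beforehand.

```python
import shlex


def build_commands(sample, ref, fq1, fq2, threads=8):
    """Build shell commands for alignment (stage 2) and variant calling (stage 3).

    Hypothetical helper: output file names are derived from the sample name.
    """
    q = shlex.quote
    bam = f"{sample}.sorted.bam"
    vcf = f"{sample}.vcf.gz"
    # Read group header passed to bwa; GATK expects read groups in the BAM.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"
    return [
        # Stage 2: align paired-end FASTQ reads and write a sorted BAM.
        f"bwa mem -t {threads} -R {q(read_group)} {q(ref)} {q(fq1)} {q(fq2)}"
        f" | samtools sort -o {q(bam)} -",
        # Index the BAM so downstream tools can access it by region.
        f"samtools index {q(bam)}",
        # Stage 3: call variants from the BAM and write a VCF.
        f"gatk HaplotypeCaller -R {q(ref)} -I {q(bam)} -O {q(vcf)}",
    ]


if __name__ == "__main__":
    for cmd in build_commands("s1", "ref.fa", "s1_R1.fastq.gz", "s1_R2.fastq.gz"):
        print(cmd)
```

Keeping command construction separate from execution makes it straightforward to hand the same commands to a cluster scheduler, which fits the multi-node setup described above.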
Application scenario 2: De novo assembly analysis
De novo assembly analysis includes three stages:
Stage 1: The sequencer collects data and processes it to generate the raw files in FASTQ format;
Stage 2: Quality control is performed on the FASTQ files, and contig/scaffold result files are generated through sequence assembly;
Stage 3: Glimmer and other prediction software predicts genes on the contigs, and the predicted genes are functionally annotated.
Establishing bioinformatics analysis pipelines
Gold-standard analysis pipelines for high-throughput sequencing data, plus customized analysis pipelines
- Nine pre-installed gold-standard analysis pipelines for high-throughput sequencing data, in three categories, to meet the analysis needs of most sequencing projects
- Regular maintenance and upgrades of data analysis software and databases
- Remote guidance and on-site training for bioinformatics personnel
- Customization and deployment services for special analysis pipelines
The genomic workflow integrates a set of analysis software and related bioinformatics databases to process raw sequence data (FASTQ) into variant data (VCF). Each box represents an analysis module built from integrated genomic analysis software, such as data quality control, sequence alignment, variant calling, or variant annotation. Each module can be used as an independent workflow or connected, according to its logical dependencies, into a larger workflow.
Genomic workflow diagram
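The idea of modules that run alone or chain into a larger workflow can be sketched as follows. This is a toy illustration under stated assumptions: the module functions, the dictionary keys, and the file-name suffixes are all hypothetical placeholders that only track file names and do not invoke any real analysis software.

```python
from typing import Callable, Dict, List

# A module takes the current workflow state and returns an updated copy.
Module = Callable[[Dict[str, str]], Dict[str, str]]


def quality_control(ctx: Dict[str, str]) -> Dict[str, str]:
    return {**ctx, "clean_fastq": ctx["fastq"] + ".clean"}


def alignment(ctx: Dict[str, str]) -> Dict[str, str]:
    return {**ctx, "bam": ctx["clean_fastq"] + ".bam"}


def variant_calling(ctx: Dict[str, str]) -> Dict[str, str]:
    return {**ctx, "vcf": ctx["bam"] + ".vcf"}


def variant_annotation(ctx: Dict[str, str]) -> Dict[str, str]:
    return {**ctx, "annotated_vcf": ctx["vcf"] + ".annotated"}


def run_workflow(modules: List[Module], ctx: Dict[str, str]) -> Dict[str, str]:
    """Run the modules in order, feeding each one the previous result."""
    for module in modules:
        ctx = module(ctx)
    return ctx


if __name__ == "__main__":
    # A single module used as an independent workflow.
    print(run_workflow([quality_control], {"fastq": "sample.fastq"}))
    # The same modules connected into a FASTQ-to-VCF workflow.
    full = [quality_control, alignment, variant_calling, variant_annotation]
    print(run_workflow(full, {"fastq": "sample.fastq"}))
```

Because every module shares the same input and output shape, any contiguous sub-chain is itself a valid workflow, which mirrors the independent-or-connected design described above.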
High integration of a wide range of biological software and databases