7.Error Code

7-1.Summary

Error ID    Description
Err001      Octopus-toolkit cannot access the web page.
Err002      Incorrect GEO accession number.
Err003      The experiment type cannot be handled with Octopus-toolkit.
Err004      The data cannot be processed.
Err005      Not enough disk space.
Err006      Related to each processing step.
Err007      Some analysis tools are not installed.
Err008      Incorrect password.
Err009      Octopus-toolkit cannot read or write files on your computer.
Err010      Incorrect number of paired-end data files.

If you have any questions, please contact us at Octopustoolkit@gmail.com.

7-2.Detail

Err001

Octopus-toolkit attempts to access the NCBI server (National Center for Biotechnology Information) to obtain sample information.

If your network connection is unstable, or the NCBI server is temporarily unavailable, Octopus-toolkit cannot get information for the GSE and/or GSM.

First, check the network connection of your computer. If it is working, check whether the NCBI server is operating normally.

If neither check solves the problem, the connection to NCBI may have timed out for an unknown reason. This is usually temporary, so please re-run Octopus-toolkit after some time.
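
The checks above can be sketched as a small script that probes NCBI a few times before giving up. This is only an illustration, not part of Octopus-toolkit: the function name `ncbi_reachable`, the probed URL, and the retry settings are all assumptions.

```python
import time
import urllib.error
import urllib.request

# Assumption: the GEO landing page is a reasonable URL to probe.
NCBI_GEO_URL = "https://www.ncbi.nlm.nih.gov/geo/"


def ncbi_reachable(url=NCBI_GEO_URL, retries=3, delay=5.0, timeout=10.0,
                   opener=urllib.request.urlopen, sleep=time.sleep):
    """Return True if `url` answers with HTTP 200 within `retries` attempts."""
    for attempt in range(1, retries + 1):
        try:
            with opener(url, timeout=timeout) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # network down, DNS failure, or timeout
        if attempt < retries:
            sleep(delay)  # wait before trying again
    return False
```

Calling `ncbi_reachable()` with the defaults returns `False` when either your network or the NCBI server is the problem. In that case, wait and re-run Octopus-toolkit later.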

Err002

Octopus-toolkit obtains sample information from the GEO (Gene Expression Omnibus) website.

  • GEO Accession Number

    A GSExxx is a unique GEO accession number assigned to a study.
    A GSMxxx is a unique GEO accession number assigned to a sample. A single GSE (study) can have a number of GSM (samples).
    

Octopus-toolkit can only process GSE or GSM IDs that are registered in GEO. Err002 occurs when you enter an unregistered or misspelled accession ID.

  • Unregistered GSE id (Input : GSE999999)
_images/Err002_Not_Exist.png
  • Misspelled or incorrect accession number (Input : ChIP-Seq)
_images/Err002_Wrong_Accession_Number.png

Please check whether the accession number is registered in GEO.
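
Before looking an ID up on the GEO website, you can catch obvious typos locally. The sketch below is not Octopus-toolkit code. It assumes that an accession is simply `GSE` or `GSM` followed by digits, and the helper name `accession_kind` is hypothetical. It cannot tell whether a well-formed ID is actually registered.

```python
import re

# Assumption: an accession is "GSE" or "GSM" followed by digits only.
_ACCESSION = re.compile(r"(GSE|GSM)\d+")


def accession_kind(text):
    """Return "GSE" or "GSM" for a well-formed ID, or None otherwise."""
    match = _ACCESSION.fullmatch(text.strip().upper())
    return match.group(1) if match else None
```

With this helper, `accession_kind("ChIP-Seq")` returns `None`, while `accession_kind("GSE999999")` returns `"GSE"` even though that study is not registered. The second case can only be detected by querying GEO.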

Err003

There are many different types of next-generation sequencing (NGS) data. As defined by NCBI (NGS data - Study type), genome binding/occupancy profiling by high throughput sequencing indicates ChIP-seq data.

Octopus-toolkit currently supports the following types of NGS data. Other NGS types will be skipped.

  • Expression profiling by high throughput sequencing (RNA-seq)
  • Genome binding/occupancy profiling by high throughput sequencing (ChIP-seq / MNase-seq / ATAC-seq / MeDIP-seq / DNase-seq)

(Other NGS types will be added later)

You can check the experiment type of a given GEO accession number on the GEO website. (ex: GSE79452)

  • Experiment Type
_images/Err003_Experiment_Type.png

Err004

Not all data in GEO can be processed with Octopus-toolkit. Before processing, Octopus-toolkit checks the following information (important): organism, library strategy, instrument model, and FTP address (SRA Experiment).

_images/Err004_GSM_Info.png

Err004 is divided into the following four subcategories.

Sub Error ID    Description
Err004-1        The organism is not supported.
Err004-2        The experiment type is not supported (for example, Exome-seq).
Err004-3        The instrument is not supported. Octopus-toolkit can only process data generated by Illumina instruments.
Err004-4        Raw data (.sra) is currently unavailable (probably newly registered data).

Err004 indicates data that Octopus-toolkit does not support. Octopus-toolkit currently handles the following data.

Type                Description
Organism            Homo sapiens, Mus musculus, Drosophila melanogaster, Saccharomyces cerevisiae, Canis lupus familiaris, Arabidopsis thaliana, Danio rerio, Caenorhabditis elegans
Library Strategy    ChIP-Seq, RNA-Seq, MeDIP-Seq, ATAC-Seq, DNase-Seq, MNase-Seq
Instrument Model    Illumina GA/HiSeq/MiSeq (Illumina)
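
To see which sub-error a sample would trigger, you can compare the values on its GSM page with the table above. The following is a rough sketch and not the toolkit's own logic. The function name `err004_subcode`, the order of the checks, and the rule that any model string containing "Illumina" counts as supported are assumptions.

```python
# Values copied from the table above, lower-cased for comparison.
SUPPORTED_ORGANISMS = {
    "homo sapiens", "mus musculus", "drosophila melanogaster",
    "saccharomyces cerevisiae", "canis lupus familiaris",
    "arabidopsis thaliana", "danio rerio", "caenorhabditis elegans",
}
SUPPORTED_STRATEGIES = {
    "chip-seq", "rna-seq", "medip-seq", "atac-seq", "dnase-seq", "mnase-seq",
}


def err004_subcode(organism, library_strategy, instrument_model, sra_available):
    """Return the Err004 sub-error a sample would raise, or None if it passes."""
    if organism.strip().lower() not in SUPPORTED_ORGANISMS:
        return "Err004-1"
    if library_strategy.strip().lower() not in SUPPORTED_STRATEGIES:
        return "Err004-2"
    # Assumption: any model string that mentions Illumina is acceptable.
    if "illumina" not in instrument_model.lower():
        return "Err004-3"
    if not sra_available:
        return "Err004-4"
    return None
```

For example, a human ChIP-Seq sample sequenced on an Illumina HiSeq whose .sra file has not been released yet would yield `"Err004-4"`.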

Err004-4 indicates that data has been registered in the GEO, but the raw data (.sra) has not been released yet. Therefore, please check the availability of raw files.

_images/Err004-4_Example.png
  • No raw files (.sra).
_images/Err004-4_Not_Exist_Page.png

Err005

This error occurs when there is not enough free disk space. To resolve it, free up enough space (more than 10 GB) and re-run the analysis.

  • Check your hard disk space.
_images/Err005_File_System_Monitor.png
  • Status window.
_images/Err005_Running_info.png
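
If you prefer to check free space from a script rather than the system monitor, a minimal sketch using Python's standard library looks like this. The helper name `has_free_space` is hypothetical. The 10 GB default comes from the text above and is interpreted here as 10 × 1024³ bytes, which is an assumption.

```python
import shutil


def has_free_space(path=".", required_gb=10.0):
    """Return True if the disk holding `path` has more than `required_gb` free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes > required_gb * 1024 ** 3
```

Pass the directory where Octopus-toolkit writes its results as `path`, since that is the disk that needs the free space.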

Err006

Err006 is divided into six subcategories.

Sub Error ID    Description
Err006-1        Cannot access NCBI’s FTP server.
Err006-2        File conversion error from .sra to .fastq using fastq-dump.
Err006-3        Related to the .fastq file while checking the quality using FastQC.
Err006-4        No input file (.fastq) for Trimming.
Err006-5        Related to the Mapping step.
Err006-6        Related to the Sorting step (BAM file).

Err006-1

NCBI provides the raw data of published samples to users through an FTP server. If the NCBI homepage is working normally, Octopus-toolkit can extract the sample information, but if the FTP server is not working, it cannot download the data.

To diagnose this issue, connect directly to the NCBI FTP server.

_images/Err006-1_Example.png

If you can connect to the FTP server, manually download the published sample.

  • NCBI FTP server is running. (Success)
_images/Err006-1_Success.png

If the server is closed or samples cannot be downloaded, please contact NCBI, because the problem is on the NCBI side.

  • NCBI FTP server is closed. (Fail)
_images/Err006-1_Fail.png

If the manual connection and download work normally, please try Octopus-toolkit again.

If Err006-1 still occurs when you retry, please contact us at the address below.

Contact us : Octopustoolkit@gmail.com

Err006-2

Raw sample data downloaded from NCBI is compressed in SRA format. For NGS analysis, the SRA file must be converted to FASTQ format. The tool used in this step is fastq-dump, which is part of the SRA Toolkit.

  • Input file : Sequence Read Archive (Extension : sra)
  • Output file : Short read sequence. (Extension : fastq)

Err006-2 occurs when the SRA file, which is the input file for fastq-dump, is missing or invalid.

This error may arise because the connection dropped while the raw data was being downloaded from the FTP server, or because the raw data uploaded to NCBI is corrupted.

Check your network status and the free space on your computer, then try the analysis again.

If the above method does not work, please contact us at the address below.

Contact us : Octopustoolkit@gmail.com

Err006-3

Err006-3 means that the input file (FASTQ) for the Quality Check is invalid, or that a system problem occurred while FastQC was running the Quality Check.

Check the FASTQ files on your computer and try the analysis again.

If the above method does not work, please contact us at the address below.

Contact us : Octopustoolkit@gmail.com

Even when the Quality Check step completes successfully, FastQC may occasionally fail to generate fastqc_data.txt.

Octopus-toolkit extracts the encoding information of the sample from fastqc_data.txt, one of the FastQC output files. If fastqc_data.txt is not generated, Octopus-toolkit instead uses the encoding of recent samples (Sanger / Illumina 1.9).

  • Err006-3 Encoding information:
_images/Err006-3_Encoding.png
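
To inspect the encoding yourself, you can read it straight from fastqc_data.txt, where FastQC records it as a tab-separated `Encoding` line in the Basic Statistics module. The sketch below is illustrative only and does not show how Octopus-toolkit is implemented. The function name `read_encoding` is hypothetical, and the fallback mirrors the default described above.

```python
from pathlib import Path

# Fallback taken from the text above; used when the file or line is missing.
DEFAULT_ENCODING = "Sanger / Illumina 1.9"


def read_encoding(fastqc_data_path, default=DEFAULT_ENCODING):
    """Return the Encoding value from fastqc_data.txt, or `default`."""
    path = Path(fastqc_data_path)
    if not path.is_file():
        return default
    with path.open(encoding="utf-8", errors="replace") as handle:
        for line in handle:
            key, _, value = line.rstrip("\n").partition("\t")
            if key == "Encoding" and value.strip():
                return value.strip()
    return default
```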

Err006-4

Err006-4 occurs when there is no input file (FASTQ) for the Trimming step, or when all reads are removed because of bad quality.

Check the FASTQ files on your computer and try the analysis again.

If the above method does not work, please contact us at the address below.

Contact us : Octopustoolkit@gmail.com

If all reads are removed because of bad quality, Octopus-toolkit proceeds to the next step (Mapping) with the non-trimmed input file (FASTQ).

Err006-5

Err006-5 may arise due to the following reasons.

  • The input file (non-trimmed or trimmed FASTQ) does not exist.
  • A large number of reads were trimmed because of bad sequencing quality or a high threshold.
  • Too few mapped reads (less than 2 megabytes).

Check your input files (non-trimmed and trimmed FASTQ files), the read count, and the file size after trimming.
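
A quick way to compare the read count and file size before and after trimming is sketched below. It is not part of Octopus-toolkit. The helper name `fastq_summary` is hypothetical, and it assumes an uncompressed FASTQ file with exactly four lines per read.

```python
import os


def fastq_summary(path):
    """Return (read_count, size_in_bytes) for an uncompressed FASTQ file."""
    with open(path, "rb") as handle:
        line_count = sum(1 for _ in handle)
    # Assumption: each read occupies exactly four lines.
    return line_count // 4, os.path.getsize(path)
```

Running it on both the non-trimmed and the trimmed file shows how many reads survived trimming. A trimmed file with very few reads is a likely cause of Err006-5.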

Err006-6

Err006-6 occurs when the BAM (mapped) file does not exist or the number of mapped reads is too small.

Check the input file and the BAM file.

Err007

Err007 is related to the installation step.

To use Octopus-toolkit, you must complete the installation procedure: requirements (Err007-1) and analysis tools (Err007-2).

  • Requirement : Library files must be installed.
  • Analysis tools : Tools are installed automatically by Octopus-toolkit. If the installation procedure is interrupted, please remove the Octopus-toolkit directory and rerun it.

Octopus-toolkit downloads files from the HOMER website, so Err007 can also occur if the website (http://homer.ucsd.edu/homer/) is unavailable.

Err008

Err008 is related to a password issue.

  • Password : You must enter your password once during the installation step.

Please check your password and try again.

  • When you enter an incorrect password (Example : My password = ktm123)
_images/Err008_Wrong_Password.png

Err009

Err009 occurs when Octopus-toolkit cannot read or write the script files it generates. If this happens, please rerun Octopus-toolkit later.

Err010

Err010 indicates that the number of files for a paired-end sample does not match during merging.

If there are several SRA files in one sample (GSM), Octopus-toolkit will merge them.

Paired-end data must have two files, Sample1_1.fastq and Sample1_2.fastq.

Err010 occurs if either of these files is missing.
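
To find which sample is missing a mate file, you can scan the FASTQ file names in the sample directory. This is a minimal sketch, not the toolkit's own check. It assumes the `<sample>_1.fastq` / `<sample>_2.fastq` naming shown above, and the function name `unpaired_samples` is hypothetical.

```python
import re
from collections import defaultdict

# Assumption: mates are named <sample>_1.fastq and <sample>_2.fastq.
_MATE = re.compile(r"(?P<sample>.+)_(?P<mate>[12])\.fastq")


def unpaired_samples(filenames):
    """Return sorted sample names that lack either the _1 or the _2 file."""
    mates = defaultdict(set)
    for name in filenames:
        match = _MATE.fullmatch(name)
        if match:
            mates[match["sample"]].add(match["mate"])
    return sorted(s for s, found in mates.items() if found != {"1", "2"})
```

For instance, passing `os.listdir()` of a directory that contains `Sample1_1.fastq` but no `Sample1_2.fastq` returns `["Sample1"]`.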