chado.load package

Module contents

Contains methods for loading Blast, GO, and InterProScan results into Chado

class chado.load.LoadClient(engine, metadata, session, ci)

Bases: chado.client.Client

blast(analysis_id, organism_id, input, blastdb=None, blastdb_id=None, re_name=None, query_type='polypeptide', match_on_name=False, skip_missing=False)

Load a Blast analysis in the same way as the tripal_analysis_blast module does

Parameters:
  • analysis_id (int) – Analysis ID
  • organism_id (int) – Organism ID
  • input (str) – Path to the Blast XML file to load
  • blastdb (str) – Name of the database blasted against (must be in the Chado db table)
  • blastdb_id (int) – ID of the database blasted against (must be in the Chado db table)
  • query_type (str) – The feature type (e.g. ‘gene’, ‘mRNA’, ‘polypeptide’, ‘contig’) of the query. It must be a valid Sequence Ontology term.
  • match_on_name (bool) – Match features using their name instead of their uniquename
  • re_name (str) – Regular expression to extract the feature name from the input file (first capturing group will be used).
  • skip_missing (bool) – Skip lines with unknown features or GO IDs instead of aborting the whole load.
Return type:

dict

Returns:

Number of processed hits
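
A minimal sketch of a blast call, using only the keyword arguments documented above. The helper name `load_blast_results` and the values in the usage comment are hypothetical. How the `LoadClient` instance is obtained is outside this section, so it is passed in as `client`.

```python
def load_blast_results(client, analysis_id, organism_id, xml_path, blastdb_name):
    """Forward the documented keyword arguments to client.blast().

    `client` is assumed to be a chado.load.LoadClient instance (or any
    object exposing a compatible blast() method).
    """
    return client.blast(
        analysis_id=analysis_id,
        organism_id=organism_id,
        input=xml_path,            # path to the Blast XML file
        blastdb=blastdb_name,      # name in the Chado db table; blastdb_id is the ID-based alternative
        query_type="polypeptide",  # must be a valid Sequence Ontology term
        match_on_name=False,       # match on uniquename rather than name
        skip_missing=True,         # skip unknown features instead of aborting
    )

# Hypothetical usage (all values are placeholders):
# result = load_blast_results(client, 12, 3, "blast_results.xml", "uniprot_sprot")
```

The keys of the returned dict are not described in this section, so inspect the result rather than assuming a particular layout.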

go(input, organism_id, analysis_id, query_type='polypeptide', match_on_name=False, name_column=2, go_column=5, re_name=None, skip_missing=False)

Load GO annotations from a tabular file in the same way as the tripal_analysis_go module does

Parameters:
  • input (str) – Path to the input tabular file to load
  • organism_id (int) – Organism ID
  • analysis_id (int) – Analysis ID
  • query_type (str) – The feature type (e.g. ‘gene’, ‘mRNA’, ‘polypeptide’, ‘contig’) of the query. It must be a valid Sequence Ontology term.
  • match_on_name (bool) – Match features using their name instead of their uniquename
  • name_column (int) – Column containing the feature identifiers (2, 3, 10 or 11; default=2).
  • go_column (int) – Column containing the GO ID (default=5).
  • re_name (str) – Regular expression to extract the feature name from the input file (first capturing group will be used).
  • skip_missing (bool) – Skip lines with unknown features or GO IDs instead of aborting the whole load.
Return type:

dict

Returns:

Number of inserted GO terms
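
The allowed `name_column` values (2, 3, 10, 11) and the default `go_column` of 5 line up with the GAF column layout when columns are counted from 1. The sketch below illustrates that reading of the two parameters. It is not the library's parser: the 1-based counting, the skipping of `!` comment lines, and the helper name `iter_go_rows` are all assumptions made for illustration.

```python
import io

def iter_go_rows(handle, name_column=2, go_column=5):
    """Yield (feature_identifier, go_id) pairs from a tab-separated file.

    Columns are counted from 1 (assumption), mirroring the parameter defaults.
    """
    for line in handle:
        line = line.rstrip("\n")
        # Assumption: blank lines and GAF-style "!" comment lines carry no data.
        if not line or line.startswith("!"):
            continue
        fields = line.split("\t")
        if len(fields) < max(name_column, go_column):
            continue  # too few columns; the real loader may react differently
        yield fields[name_column - 1], fields[go_column - 1]

# Made-up sample row with 11 tab-separated columns.
sample = (
    "!gaf-version: 2.1\n"
    "DB\tFEAT0001\tsymA\t\tGO:0008150\tPMID:0\tIEA\t\tP\tname A\tsynA\n"
)
pairs = list(iter_go_rows(io.StringIO(sample)))
by_symbol = list(iter_go_rows(io.StringIO(sample), name_column=3))
```

Here `pairs` takes the feature identifier from column 2, while `by_symbol` takes it from column 3, which is the kind of switch `name_column` is documented to control.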

interpro(analysis_id, organism_id, input, parse_go=False, re_name=None, query_type='polypeptide', match_on_name=False, skip_missing=False)

Load an InterProScan analysis in the same way as the tripal_analysis_interpro module does

Parameters:
  • analysis_id (int) – Analysis ID
  • organism_id (int) – Organism ID
  • input (str) – Path to the InterProScan file to load
  • parse_go (bool) – Load GO annotation to the database
  • query_type (str) – The feature type (e.g. ‘gene’, ‘mRNA’, ‘polypeptide’, ‘contig’) of the query. It must be a valid Sequence Ontology term.
  • match_on_name (bool) – Match features using their name instead of their uniquename
  • re_name (str) – Regular expression to extract the feature name from the input file (first capturing group will be used).
  • skip_missing (bool) – Skip lines with unknown features or GO IDs instead of aborting the whole load.
Return type:

dict

Returns:

Number of processed hits
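
The `re_name` parameter takes the feature name from the first capturing group of a regular expression. The sketch below shows what that means with Python's `re` module. It is an illustration only: whether the loader uses `re.search` or `re.match`, and what it does when the pattern does not match, are not documented here, and the helper name and sample identifier are hypothetical.

```python
import re

def extract_feature_name(raw_id, re_name=None):
    """Return the feature name extracted from raw_id.

    With no pattern the identifier is returned unchanged. With a pattern,
    the first capturing group is returned, or None when nothing matches
    (assumption: the real loader's handling of a non-match may differ).
    """
    if re_name is None:
        return raw_id
    match = re.search(re_name, raw_id)
    return match.group(1) if match else None

# Hypothetical identifier as it might appear in an input file.
raw = "FEAT0001-PA some free-text description"
name = extract_feature_name(raw, r"^(\S+)-PA")
```

Testing a pattern this way against a few identifiers from the input file before loading helps confirm that the captured group matches the feature names (or uniquenames) stored in Chado.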