SCRIdb.worker.worker_main
SCRIdb.worker.worker_main(f_in, source_path=None, target_path=None, runseqc=True, hashtag=True, vdj=True, atac=True, cr=True, no_rsync=False, save=False, **args)
    Process raw sequencing data returned from IGO. Newly sequenced samples are copied from the IGO shared drive to a defined S3URI. Then the appropriate pipeline is called to process the copied raw data.
- Parameters
    f_in (Union[str, list]) – Input file name, a single sample name, or a list of sample names, sequenced and ready to be processed.
    source_path (Optional[str]) – Source path to the parent directory of sequenced samples, usually an IGO shared drive.
    target_path (Optional[str]) – Target path to the parent directory of sequenced samples, usually an S3URI.
    runseqc (bool) – Call the seqc pipeline. Default: True.
    hashtag (bool) – Call the hashtag pipeline. Default: True.
    vdj (bool) – Call the VDJ pipeline. Default: True.
    atac (bool) – Call the atac-seq pipeline. Default: True.
    cr (bool) – Call the Cell Ranger pipeline. Default: True.
    no_rsync (bool) – Skip copying files to S3. Default: False.
    save (bool) – Write sample_data to the .csv output configured in --results_output. Default: False.
    args – Additional keyword arguments passed to other methods.
- Return type
    None
- Returns
    None.
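The f_in parameter and the pipeline flags described above can be illustrated with a small standalone sketch. It does not reproduce how worker_main works internally, which this page does not show. The helpers normalize_samples and enabled_pipelines are hypothetical names introduced only for illustration, and the sketch treats a plain string as a single sample name, ignoring the input-file case.

```python
from typing import List, Union


def normalize_samples(f_in: Union[str, List[str]]) -> List[str]:
    """Hypothetical helper: return the sample names as a list.

    A single string is wrapped in a list; a list is copied unchanged.
    (The real function also accepts an input file name; that case is
    not modeled here.)
    """
    if isinstance(f_in, str):
        return [f_in]
    return list(f_in)


def enabled_pipelines(
    runseqc: bool = True,
    hashtag: bool = True,
    vdj: bool = True,
    atac: bool = True,
    cr: bool = True,
) -> List[str]:
    """Hypothetical helper: list the pipelines whose flag is True.

    The defaults mirror the documented signature (all True).
    """
    flags = {
        "seqc": runseqc,
        "hashtag": hashtag,
        "VDJ": vdj,
        "atac-seq": atac,
        "Cell Ranger": cr,
    }
    return [name for name, is_on in flags.items() if is_on]


samples = normalize_samples("Sample_CCR7_DC_1_IGO_10587_12")
pipelines = enabled_pipelines(runseqc=False)
```

With runseqc=False, as in the example that follows, every flag except seqc stays at its default of True.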
Example
>>> from SCRIdb.worker import *
>>> args = json.load(open(os.path.expanduser("~/.config.json")))
>>> args["jobs"] = "jobs.yml"
>>> args["seqcargs"] = {"min-poly-t": 0}
>>> db_connect.conn(args)
>>> worker_main(
...     f_in=[
...         "Sample_CCR7_DC_1_IGO_10587_12",
...         "Sample_CCR7_DC_2_IGO_10587_13",
...         "Sample_CCR7_DC_3_IGO_10587_14",
...         "Sample_CCR7_DC_4_IGO_10587_15",
...     ],
...     source_path="/Volumes/peerd/FASTQ/Project_10587/MICHELLE_0194",
...     target_path="s3://dp-lab-data/sc-seq/Project_10587",
...     runseqc=False,
...     no_rsync=True,
...     **args
... )
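The example above builds args by loading a JSON configuration file and then setting extra keys before unpacking it into the call. A minimal sketch of that pattern, using only the standard library, might look like the following. The helper load_args is a hypothetical name, not part of SCRIdb, and the temporary config file with its placeholder key exists only to make the sketch self-contained.

```python
import json
import os
import tempfile


def load_args(config_path, **overrides):
    """Hypothetical helper: read a JSON config file and apply overrides.

    The file is opened in a context manager so the handle is closed
    promptly, and "~" in the path is expanded to the home directory.
    """
    with open(os.path.expanduser(config_path)) as fh:
        args = json.load(fh)
    args.update(overrides)
    return args


# Demonstration with a throwaway config file; the key below is a placeholder,
# not a real SCRIdb configuration entry.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "config.json")
    with open(demo_path, "w") as fh:
        json.dump({"example_key": "example_value"}, fh)

    args = load_args(demo_path, jobs="jobs.yml", seqcargs={"min-poly-t": 0})
```

The resulting dictionary can then be unpacked with **args in the same way as in the example.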