Technical

Data Bank connectors

How the three live connectors talk to public services — and where their results land.

cBioPortal

services/connectors/cbioportal.py runs sync_studies() across ~20 cancer types via a CANCER_HINTS dict. Each study becomes a Dataset under the cBioPortal source.
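
The hint-based selection can be pictured with a minimal sketch. The shape of CANCER_HINTS shown here (a label mapped to keywords), the helper names match_cancer_type and studies_to_datasets, and the output row fields are all assumptions for illustration, not the connector's actual code. The study fields studyId, name, and cancerTypeId follow the public cBioPortal study listing.

```python
import re
from typing import Optional

# Hypothetical shape: cancer-type label -> keywords expected in study metadata.
# The real CANCER_HINTS dict covers ~20 cancer types; three are shown here.
CANCER_HINTS = {
    "lung": ("lung", "nsclc", "luad", "lusc"),
    "breast": ("breast", "brca"),
    "colorectal": ("colorectal", "colon", "rectal", "coad"),
}


def match_cancer_type(study: dict, hints: dict = CANCER_HINTS) -> Optional[str]:
    """Return the first hint label whose keywords appear as whole tokens."""
    text = " ".join(
        str(study.get(key, "")) for key in ("studyId", "name", "cancerTypeId")
    )
    # Tokenise on non-alphanumerics so "luad_tcga" yields "luad" and "tcga".
    tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
    for label, keywords in hints.items():
        if tokens & set(keywords):
            return label
    return None


def studies_to_datasets(studies: list, hints: dict = CANCER_HINTS) -> list:
    """Map matching studies to plain dicts standing in for dataset rows."""
    rows = []
    for study in studies:
        label = match_cancer_type(study, hints)
        if label is None:
            continue  # study does not match any hinted cancer type
        rows.append(
            {
                "source": "cBioPortal",
                "external_id": study["studyId"],
                "title": study.get("name", ""),
                "cancer_type": label,
            }
        )
    return rows
```

Matching on whole tokens rather than substrings avoids accidental hits, such as a short code appearing inside a longer word.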

GDC / TCGA

services/connectors/gdc.py pulls project summaries for seven TCGA projects (LUAD, LUSC, BRCA, COAD, READ, PRAD, PAAD). Each project's metadata and counts are stored as DataSet rows.
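
As a rough sketch of what pulling a project summary could look like, the code below builds a request against the public GDC projects endpoint and flattens the response into a row-like dict. The helper names (project_url, project_to_row, fetch_project_rows) and the output fields are hypothetical, and the code assumes the response wraps the project under "data" with a "summary" holding case_count and file_count. The actual connector may differ.

```python
import json
from urllib.request import urlopen

GDC_PROJECTS_URL = "https://api.gdc.cancer.gov/projects"
# The seven TCGA project codes named in this section.
TCGA_CODES = ("LUAD", "LUSC", "BRCA", "COAD", "READ", "PRAD", "PAAD")


def project_url(code: str) -> str:
    """Build the project URL; expand=summary asks for aggregate counts."""
    return f"{GDC_PROJECTS_URL}/TCGA-{code}?expand=summary"


def project_to_row(payload: dict) -> dict:
    """Flatten one project response into a dict standing in for a row."""
    data = payload.get("data", {})
    summary = data.get("summary", {})
    return {
        "source": "GDC",
        "external_id": data.get("project_id"),
        "title": data.get("name"),
        "case_count": summary.get("case_count"),
        "file_count": summary.get("file_count"),
    }


def fetch_project_rows(codes=TCGA_CODES, timeout: float = 30) -> list:
    """Fetch each project in turn (network access required)."""
    rows = []
    for code in codes:
        with urlopen(project_url(code), timeout=timeout) as response:
            rows.append(project_to_row(json.load(response)))
    return rows
```

Keeping the parsing step separate from the HTTP call lets project_to_row be tested against a saved response without touching the network.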

ClinicalTrials.gov

services/connectors/clinicaltrials.py runs sync_trials() with condition='cancer' and stores the results as PublicTrial rows. The query is pan-cancer rather than limited to specific cancer types.
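
One way such a sync might page through results is sketched below, assuming the ClinicalTrials.gov v2 studies endpoint with its query.cond, pageSize, and pageToken parameters. The names fetch_page, study_to_trial, and iter_trials, the max_pages cap, and the output fields are illustrative assumptions. The real sync_trials() may be structured differently.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "https://clinicaltrials.gov/api/v2/studies"


def fetch_page(condition: str, page_token=None, page_size: int = 100) -> dict:
    """Fetch one page of studies for a condition (network access required)."""
    params = {"query.cond": condition, "pageSize": page_size}
    if page_token:
        params["pageToken"] = page_token
    with urlopen(f"{API_URL}?{urlencode(params)}", timeout=30) as response:
        return json.load(response)


def study_to_trial(study: dict) -> dict:
    """Pick a few fields out of the nested protocolSection structure."""
    protocol = study.get("protocolSection", {})
    ident = protocol.get("identificationModule", {})
    status = protocol.get("statusModule", {})
    conditions = protocol.get("conditionsModule", {})
    return {
        "nct_id": ident.get("nctId"),
        "title": ident.get("briefTitle"),
        "status": status.get("overallStatus"),
        "conditions": conditions.get("conditions", []),
    }


def iter_trials(condition: str = "cancer", fetch=fetch_page, max_pages: int = 5):
    """Yield trial dicts, following nextPageToken until it runs out."""
    token = None
    for _ in range(max_pages):  # hypothetical safety cap on page count
        page = fetch(condition, token)
        for study in page.get("studies", []):
            yield study_to_trial(study)
        token = page.get("nextPageToken")
        if not token:
            break
```

Passing the fetch function as a parameter makes it possible to substitute canned pages in tests.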

Ingestion jobs

Every run records a DataIngestionJob with a status (running, success, or failed) and a log. The admin / Data Bank UI lists these jobs under 'Ingestion jobs'.
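
The job lifecycle described above can be sketched as a context manager that starts a job as running and settles it as success or failed. The IngestionJob dataclass and track_job helper are hypothetical in-memory stand-ins for the DataIngestionJob model, whose real fields and persistence are not shown in this section.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class IngestionJob:
    """In-memory stand-in for a DataIngestionJob record (fields assumed)."""
    connector: str
    status: str = "running"
    log: List[str] = field(default_factory=list)
    started_at: datetime = field(default_factory=_now)
    finished_at: Optional[datetime] = None

    def write(self, message: str) -> None:
        """Append one timestamped line to the job log."""
        self.log.append(f"{_now().isoformat()} {message}")


@contextmanager
def track_job(connector: str):
    """Wrap one connector run and record how it ended."""
    job = IngestionJob(connector)
    job.write("started")
    try:
        yield job
    except Exception as exc:
        job.status = "failed"
        job.write(f"failed: {exc!r}")
        raise  # let the caller see the original error
    else:
        job.status = "success"
        job.write("finished")
    finally:
        job.finished_at = _now()
```

A connector run would then look like `with track_job("gdc") as job: ...`, with the body calling job.write() as it progresses. Re-raising after marking the job failed keeps errors visible to the caller while still leaving a record.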