AlphaGenome Specification

GammaHelix is explicitly designed around the architectural requirements of the AlphaGenome predictive model. We have optimized our pipeline to handle the model's specific constraints so researchers can focus on biology, not boilerplate code.

The 1Mb Context Window

The underlying model requires an exact input length of 1,048,576 base pairs to accurately compute 3D chromatin folding and distal enhancer-promoter interactions.

* Our Implementation: When you input a shorter sequence (e.g., a 1.2kb promoter region), GammaHelix's backend automatically centers your sequence and applies N (unknown) padding to the flanks. This satisfies the tensor shape requirements without creating artificial regulatory noise.
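The centering and padding step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not GammaHelix's actual backend code: the function name `center_pad`, the constant `TARGET_LENGTH`, and the choice to put the extra base on the right flank when the padding is odd are all hypothetical.

```python
TARGET_LENGTH = 1_048_576  # 2**20 base pairs, the input length stated above


def center_pad(sequence: str, target_length: int = TARGET_LENGTH,
               pad_char: str = "N") -> str:
    """Center `sequence` in a window of `target_length` using N padding.

    Hypothetical helper for illustration only.
    """
    sequence = sequence.upper()
    missing = target_length - len(sequence)
    if missing < 0:
        raise ValueError(
            f"sequence is {len(sequence)} bp, longer than {target_length} bp"
        )
    # Split the padding evenly; if it is odd, the right flank gets one more N
    # (an arbitrary choice made for this sketch).
    left = missing // 2
    right = missing - left
    return pad_char * left + sequence + pad_char * right


# Usage: a 1.2 kb promoter-sized sequence padded to the full window.
promoter = "ACGT" * 300  # 1,200 bp placeholder sequence
padded = center_pad(promoter)
offset = (TARGET_LENGTH - len(promoter)) // 2
print(len(padded), padded[offset:offset + 8])
```

Because the flanks contain only N, the padded window carries no sequence content beyond what you supplied. Sequences longer than the window are rejected here rather than silently truncated; how GammaHelix handles that case is not covered in this section.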

Academic References

For a comprehensive understanding of the original model weights, training data, and foundational architecture, please refer to the original publications:

* AlphaGenome Original Preprint (Google DeepMind)
* Nature Publication / Peer-Reviewed Manuscript