Summary
BioTeam partnered with a major hyperscaler to migrate generative AI tools used in computational chemistry workflows to a managed cloud service running Nextflow. The work involved containerizing and deploying ML models while adhering to MLOps best practices. BioTeam used Nextflow’s concurrency features to run these workflows in parallel.
Services Provided
- Reviewed existing infrastructure and created architectural diagrams
- Automated data and model staging for a secure cloud environment
- Added CI/CD workflows for automation and code quality
- Migrated deep learning models and workflows to the cloud, including AlphaFold, ProteinMPNN, RFdiffusion, OpenFold, and others
- Developed schemas for integration into a composable AI platform, incorporating models like RFantibody, Aggrescan3D, PLM Pseudo Perplexity, TemStaPro, and others
- Optimized and right-sized resources
- Modernized workflows for increased parallelism and concurrency
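As an illustration of the parallelism mentioned in the last item, the sketch below fans a set of independent model runs out across a worker pool and gathers the results in submission order. It is a plain Python analogy for the scatter/gather pattern that workflow engines such as Nextflow apply to independent tasks, not the pipeline code from this engagement. The function `run_model`, the tool names, and the sequences are hypothetical placeholders.

```python
from concurrent.futures import ThreadPoolExecutor


def run_model(tool: str, sequence: str) -> dict:
    """Hypothetical stand-in for one containerized model invocation."""
    # A real task would launch a container; here we only echo the inputs.
    return {"tool": tool, "sequence": sequence, "length": len(sequence)}


def scatter_gather(tools, sequences, max_workers=4):
    """Run every (tool, sequence) pair as an independent job."""
    jobs = [(tool, seq) for tool in tools for seq in sequences]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the submitted jobs.
        return list(pool.map(lambda job: run_model(*job), jobs))


# Assumed example inputs: two placeholder tools, two short sequences.
results = scatter_gather(["tool_a", "tool_b"], ["MKT", "GAVLI"])
for record in results:
    print(record)
```

Because each job depends only on its own inputs, the jobs can be scheduled in any order and on any number of workers, which is the property that makes this kind of workload scale on managed cloud compute.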
Challenges
The hyperscaler serves customers who rely on compute-intensive drug discovery and computational chemistry software. Poorly optimized workloads waste both compute time and money. These customers require highly optimized software but do not want the burden of managing the underlying infrastructure.
Approach
BioTeam collaborated closely with the hyperscaler to migrate computational chemistry software to a managed cloud service. This migration included:
- Assessment & Planning: Analyzed existing generative AI tools and their dependencies and created a detailed plan for migrating them to Nextflow and managed cloud services.
- Nextflow Pipeline Development: Built and configured pipelines for each generative AI tool, ensuring proper execution and data flow.
- Cloud Integration: Established secure and efficient data ingress and egress while meeting privacy and compliance standards.
- Optimization & Testing: Tuned pipeline performance with rigorous testing for accuracy, scalability, and reproducibility.
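The source does not describe how resources were tuned. One common right-sizing pattern, sketched here under that assumption, is to start each task with a modest memory request and retry with a larger one only when the task runs out of memory. Nextflow supports a similar idea by combining a retry error strategy with resource directives that scale with `task.attempt`. The names `run_with_escalation` and `fake_fold`, and the memory figures, are hypothetical.

```python
def run_with_escalation(task, base_gb=4, max_attempts=3):
    """Call task(memory_gb), growing the request after each MemoryError.

    Returns a (result, memory_gb) tuple for the first attempt that succeeds.
    """
    for attempt in range(1, max_attempts + 1):
        memory_gb = base_gb * attempt  # 4 GB, 8 GB, 12 GB with the defaults
        try:
            return task(memory_gb), memory_gb
        except MemoryError:
            if attempt == max_attempts:
                raise  # give up once the largest request has also failed


def fake_fold(memory_gb):
    """Hypothetical task that needs at least 10 GB to finish."""
    if memory_gb < 10:
        raise MemoryError(f"{memory_gb} GB is not enough")
    return "ok"


outcome, granted_gb = run_with_escalation(fake_fold)
print(outcome, granted_gb)
```

Small inputs then finish on the cheapest allocation, and only the tasks that actually need more memory pay for it.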
Outcomes
The workflows were successfully migrated to a managed cloud service, improving accessibility, scalability, and reproducibility for computational chemistry and drug discovery on the hyperscaler's platform.

