Creating a CI/CD pipeline for Data / ETL

Has anyone used SST/CDK for setting up data pipelines? I imagine this would involve things like Redshift, Glue, etc. But then how would any logic that’s defined in Glue be tested as part of the pipeline?

So far I have been doing ETLs using Lambdas and DocumentDB. I also used JS instead of Python, which made it really hard to work with data! I’ve only just started looking into Glue and whether it might be a better fit, but I also want to keep all the CI/CD awesomeness of pipelines.
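
To make the testing question concrete: one possible approach (a sketch, not something confirmed to be the recommended Glue workflow) is to keep the transformation logic in plain Python functions that do not import anything from Glue, so a normal CI step can unit test them before the job is deployed. The record schema and the function names `clean_order` and `transform` below are made up purely for illustration.

```python
from datetime import datetime


def clean_order(record: dict) -> dict:
    """Normalize one raw order record (hypothetical schema)."""
    return {
        "order_id": str(record["id"]).strip(),
        # Store money as integer cents to avoid float drift downstream.
        "amount_cents": round(float(record["amount"]) * 100),
        "ordered_on": datetime.fromisoformat(record["created_at"]).date().isoformat(),
    }


def transform(records: list) -> list:
    """Drop records with a missing or empty id, then clean the rest."""
    return [clean_order(r) for r in records if r.get("id") not in (None, "")]


if __name__ == "__main__":
    sample = [
        {"id": " 42 ", "amount": "19.99", "created_at": "2023-01-05T10:30:00"},
        {"id": None, "amount": "5.00", "created_at": "2023-01-06T08:00:00"},
    ]
    print(transform(sample))
```

The Glue job script would then presumably just read the data, call `transform`, and write the result, which would leave only a thin layer that needs an actual Glue environment to exercise.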