Demystifying The Options for Triggering AWS CodePipeline with Amazon S3 Events

By Shing Lyu    

Disclaimer: This content reflects my personal opinions, not those of any organizations I am or have been affiliated with. Code samples are provided for illustration purposes only; use them with caution and test thoroughly before deployment.

Triggering AWS CodePipeline when new files are uploaded to Amazon S3 is a very common use case. For example, when new data is uploaded, you can start a pipeline that runs SageMaker model retraining or inference. However, the documentation and the services involved have gone through multiple updates, making it confusing to understand the current recommended approach. In this post, I will untangle the different options and show you which one is the most up-to-date and recommended approach.

The History

To understand how these services evolved and what the most up-to-date method is, let’s first take a look at the three major components involved in this process: Amazon S3, the trigger mechanism, and AWS CodePipeline, along with their respective histories.

Amazon S3:

CodePipeline:

Trigger mechanism:

Putting them together, here is the timeline of the possible combinations:

Mixed document versions

Because of these successive updates, the documentation and the AWS Console reference a mix of old and new versions:

Conclusion

If you don’t have any legacy setup (e.g., a polling pipeline, or one still using CloudWatch Events), the recommended approach is the 2021 method:

  1. Enable S3 Event Notifications for EventBridge in the S3 bucket settings.

  2. Create an EventBridge rule to trigger CodePipeline on S3 changes.
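The two steps above can be sketched with boto3. This is a minimal illustration under my own assumptions, not a drop-in script: the bucket name, rule name, pipeline ARN, and role ARN are placeholders, and the IAM role must allow EventBridge to call `codepipeline:StartPipelineExecution`.

```python
import json


def s3_object_created_pattern(bucket_name):
    """EventBridge event pattern matching object-created events in one bucket."""
    return {
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": {"bucket": {"name": [bucket_name]}},
    }


def wire_s3_to_codepipeline(bucket_name, rule_name, pipeline_arn, role_arn):
    """Step 1: enable EventBridge delivery for the bucket's notifications.
    Step 2: create a rule that starts the pipeline on object-created events.

    Requires boto3 and AWS credentials. role_arn must be an IAM role that
    EventBridge can assume to start the pipeline.
    """
    import boto3  # imported here so the pattern helper works without boto3 installed

    s3 = boto3.client("s3")
    events = boto3.client("events")

    # Step 1: turn on S3 Event Notifications delivery to EventBridge.
    # Caution: this call replaces the bucket's entire notification
    # configuration; merge with any existing configuration in real use.
    s3.put_bucket_notification_configuration(
        Bucket=bucket_name,
        NotificationConfiguration={"EventBridgeConfiguration": {}},
    )

    # Step 2: create the rule and point it at the pipeline.
    events.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(s3_object_created_pattern(bucket_name)),
        State="ENABLED",
    )
    events.put_targets(
        Rule=rule_name,
        Targets=[
            {"Id": "codepipeline", "Arn": pipeline_arn, "RoleArn": role_arn}
        ],
    )
```

If you only want the pipeline to fire for a subset of objects, you can narrow the `detail` section of the pattern further (for example, with a key prefix filter) instead of matching every object created in the bucket.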

The documentation for this approach can be found here: Migrate polling pipelines to use event-based change detection - AWS CodePipeline.
