The goal: beat (or at least replicate) Airbnb's amenity detection, which detects key household items in images, publish all the code, and make the model accessible in a demo app someone can use on their phone. The full solution ended up being: data collected from Open Images, a model trained with Detectron2, a front-end application built with Streamlit, and deployment using Docker, Google Container Registry and Google App Engine. I documented the entire journey day-by-day in Notion along with weekly YouTube videos discussing progress, open-sourced all the code and built a tutorial in Colab where you can use my trained model (see the links).
Note to self: You learn the most working on your own projects. Modelling is the easy part; collecting data and getting your application live to users is the challenging part. Detectron2 is a powerful beast for computer vision tasks but may be overkill as a standalone choice of modelling platform. Its size led to difficulties when deploying. Next time, I'll start as simple as possible, adding complexity only when needed.
Don't forget to tag @mrdbourke in your comment.