
Step 1: Audit and learn
Before deleting anything, take the time to understand who and what is still accessing the bucket. Use logs to check for recent traffic. If you see active requests coming from old versions of your app, third-party services, or users, investigate them. Pay extra attention to requests attempting to pull executable code, machine learning models, dynamic web content (such as JavaScript), and sensitive configuration files.
Checking the user agent of each requester may show that many requests come from bots, data crawlers, and scanners. These requests are essentially background noise, and don’t indicate that the bucket is actively required for your systems to function correctly. They are not dangerous and can be safely disregarded because they don’t represent legitimate traffic from your applications or users.
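As a rough illustration of separating background noise from traffic worth investigating, the sketch below splits log entries by user agent. The entry shape (a dict with a `user_agent` key) and the keyword list are assumptions made for this example, not a format that Cloud Logging guarantees. Adjust both to match your own logs.

```python
import re

# Hypothetical keyword list for automated clients; an assumption for
# illustration only. Extend it based on what your own logs show.
BOT_PATTERN = re.compile(r"bot|crawl|spider|scan", re.IGNORECASE)

def split_traffic(entries):
    """Split log entries into (likely_automated, needs_review) by user agent.

    Entries with a missing or empty user agent go to needs_review, because
    nothing identifies them as background noise.
    """
    automated, needs_review = [], []
    for entry in entries:
        agent = entry.get("user_agent") or ""
        if BOT_PATTERN.search(agent):
            automated.append(entry)
        else:
            needs_review.append(entry)
    return automated, needs_review
```

Anything that lands in `needs_review` is a candidate for the investigation described above, especially if it requests executable code or configuration files.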
Step 2: Delete with confidence
Many automated processes and user activities don’t happen every day, so it’s important to wait at least a week before deleting the bucket. A full week increases your confidence that you’ve observed a complete cycle of activity, including:
- Weekly reports: Scripts that generate reports and perform data backups on a weekly schedule.
- Batch jobs: Automated tasks that might only run over the weekend or on a specific day of the week.
- Infrequent user access: Users who may only use a feature that relies on the bucket’s data once a week.
After you’ve verified that no legitimate traffic has hit the bucket for at least a week, and you’ve updated all of your legacy code, you can proceed with deleting the bucket. Keep in mind that deleting a Google Cloud project also deletes all resources associated with it, including all of its Cloud Storage buckets.
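The one-week rule can be expressed as a simple check. This is a minimal sketch: the function name `quiet_for` and its input (a list of timestamps for legitimate requests that you have already extracted from your logs) are hypothetical.

```python
from datetime import datetime, timedelta, timezone

def quiet_for(request_times, now, window=timedelta(days=7)):
    """Return True if no legitimate request occurred within `window` before `now`.

    An empty list also returns True, so first confirm that logging was
    enabled for the whole window.
    """
    cutoff = now - window
    return all(t < cutoff for t in request_times)

now = datetime.now(timezone.utc)
print(quiet_for([now - timedelta(days=10)], now))  # True
print(quiet_for([now - timedelta(days=2)], now))   # False
```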
Next, find and fix code that references dangling buckets
Preventing future issues is key, but you may have references to dangling buckets in your environment right now. Here’s a plan to hunt them down and fix them.
Step 3: Proactive discovery
Analyze your logs: This is one of your most powerful tools. Query your Cloud Logging data for server and application logs showing repeated 404 Not Found errors for storage URLs. For example, a high volume of failed requests to the same non-existent bucket name is a major red flag (and to remediate it, we recommend you continue with Step 3 and then proceed to Step 4).
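To spot a bucket name that keeps failing, you could tally 404 responses per bucket. The sketch below assumes log entries have already been exported as dicts with `status` and `url` keys. That shape is hypothetical, so map it to whatever fields your logs actually contain. It recognizes two URL forms: `storage.googleapis.com/{bucket}` and `{bucket}.storage.googleapis.com`.

```python
from collections import Counter
from urllib.parse import urlparse

HOST = "storage.googleapis.com"

def bucket_from_url(url):
    """Extract the bucket name from a Cloud Storage URL, or return None."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    if host == HOST:
        # Path-style URL: the first path segment is the bucket name.
        first = parsed.path.lstrip("/").split("/", 1)[0]
        return first or None
    if host.endswith("." + HOST):
        # Subdomain-style URL: the bucket name precedes the host suffix.
        return host[: -len("." + HOST)]
    return None

def count_404s(entries):
    """Count 404 responses per bucket name."""
    counts = Counter()
    for entry in entries:
        if entry.get("status") == 404:
            bucket = bucket_from_url(entry.get("url", ""))
            if bucket:
                counts[bucket] += 1
    return counts
```

Calling `count_404s(entries).most_common(10)` then lists the bucket names with the most failed requests.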
Scan your codebase and documentation: Perform a comprehensive scan of all your private and open-source code repositories (including old and archived ones), configurations, and documentation for any references to your storage bucket names that may no longer be in use. One of the ways to find them is to look for the following patterns:
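A minimal sketch of such a scan follows. It assumes three commonly used ways of addressing a Cloud Storage bucket: `gs://{bucket}`, `storage.googleapis.com/{bucket}`, and `{bucket}.storage.googleapis.com`. The regular expressions are simplified approximations of the bucket naming rules, and the function name is hypothetical.

```python
import re

# Simplified bucket-name shape: lowercase letters, digits, dots, dashes, and
# underscores, starting and ending with a letter or digit.
NAME = r"([a-z0-9][a-z0-9._-]+[a-z0-9])"

BUCKET_PATTERNS = [
    re.compile(r"gs://" + NAME),
    # The lookbehind skips subdomain-style URLs, where the path is not a bucket.
    re.compile(r"(?<![\w.-])storage\.googleapis\.com/" + NAME),
    re.compile(NAME + r"\.storage\.googleapis\.com"),
]

def find_bucket_references(text):
    """Return the set of bucket names referenced in a piece of text."""
    found = set()
    for pattern in BUCKET_PATTERNS:
        found.update(pattern.findall(text))
    return found
```

Running this over each file in a repository yields a list of candidate bucket names, which you can then check for existence.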
You can find whether a bucket still exists by querying https://storage.googleapis.com/{your-bucket-name}. If you see the response NoSuchBucket, it means you identified a dangling bucket reference.
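The existence check can be scripted. The sketch below uses only the Python standard library and looks for the `NoSuchBucket` marker in the response body. The function names are hypothetical, and the `fetch` parameter exists so the logic can be exercised without network access.

```python
import urllib.error
import urllib.request

def fetch_body(url):
    """Return the response body for a GET request, including error responses."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # Error responses (such as 404) still carry a body worth inspecting.
        return err.read().decode("utf-8", "replace")

def is_dangling(bucket, fetch=fetch_body):
    """Return True if querying the bucket URL yields a NoSuchBucket response."""
    body = fetch(f"https://storage.googleapis.com/{bucket}")
    return "NoSuchBucket" in body
```

For example, `is_dangling("your-bucket-name")` queries the URL described above. Any other response, such as an access-denied error, suggests the bucket name is still registered.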
Source Credit: https://cloud.google.com/blog/products/identity-security/best-practices-to-prevent-dangling-bucket-takeovers/