From d17e10dcebef3ae5ef190bb6c16c4aa3cc77eae3 Mon Sep 17 00:00:00 2001 From: "janitbidhan@gmail.com" Date: Mon, 19 Feb 2024 21:22:44 -0800 Subject: [PATCH 1/9] docs(samples): Updating readme for soft delete cost analyzer script --- storage/cost-analysis/README.md | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/storage/cost-analysis/README.md b/storage/cost-analysis/README.md index d242d41d43c..b0964ac4349 100644 --- a/storage/cost-analysis/README.md +++ b/storage/cost-analysis/README.md @@ -27,12 +27,12 @@ NOTE: Due to the specific functionality related to Google Cloud APIs, this guide 3. A Python environment (https://cloud.google.com/python/setup) **Command-Line Arguments** -* `project_name`_ (Required): Specifies your GCP project name. -* `--cost_threshold`_ (Optional, default=0): Sets a relative cost threshold. -* `--soft_delete_window`_ (Optional, default= 604800 (i.e. 7 days)): Time window (in seconds) for considering soft-deleted objects.. -* `--agg_days`_ (Optional, default=30): The period over which to combine and aggregate results. -* `--lookback_days`_ (Optional, default=360): Time window (in days) for considering the how old the bucket to be. -* `--list`_ (Optional): Produces a simple list of bucket names. +* `project_name` - (**Required**): Specifies your GCP project name. +* `--cost_threshold` - (Optional, default=0): Sets a relative cost threshold. +* `--soft_delete_window` - (Optional, default= 604800 (i.e. 7 days)): Time window (in seconds) for considering soft-deleted objects.. +* `--agg_days` - (Optional, default=30): The period over which to combine and aggregate results. +* `--lookback_days` - (Optional, default=360): Time window (in days) for considering the how old the bucket to be. +* `--list` - (Optional, default=False): Produces a simple list of bucket names. Note: In this sample, cost_threshold 0.15 would spotlight buckets where enabling soft delete might increase costs by over 15%. @@ -40,7 +40,7 @@ Note: In this sample, cost_threshold 0.15 would spotlight buckets where enabling $ python storage_soft_delete_relative_cost_analyzer.py my-project-name ``` -**Important Note:** To disable soft-delete for buckets flagged by the script, follow these steps: +To disable soft-delete for buckets flagged by the script, follow these steps: ```code-block::bash # 1. Run the analyzer to generate a list of buckets exceeding your cost threshold: @@ -49,4 +49,6 @@ python storage_soft_delete_relative_cost_analyzer.py [your-project-name] --[OTHE # 2. Update the buckets using the generated list: cat list_of_buckets.txt | gcloud storage buckets update -I --clear-soft-delete -``` \ No newline at end of file +``` + +**Important Note:** Disabling soft-delete for flagged buckets means when deleting it will permanently delete files. These files cannot be restored, even if a soft-delete policy is later re-enabled. 
From d579e6b4b41bf390ee635799848fcd4d6474ce2c Mon Sep 17 00:00:00 2001 From: "janitbidhan@gmail.com" Date: Tue, 20 Feb 2024 13:13:46 -0800 Subject: [PATCH 2/9] Updating the way of calculations Reviewed by team: cl/608673793 --- storage/cost-analysis/README.md | 4 +- ...rage_soft_delete_relative_cost_analyzer.py | 63 +++++++++++-------- 2 files changed, 38 insertions(+), 29 deletions(-) diff --git a/storage/cost-analysis/README.md b/storage/cost-analysis/README.md index b0964ac4349..a80a620ce31 100644 --- a/storage/cost-analysis/README.md +++ b/storage/cost-analysis/README.md @@ -34,10 +34,10 @@ NOTE: Due to the specific functionality related to Google Cloud APIs, this guide * `--lookback_days` - (Optional, default=360): Time window (in days) for considering the how old the bucket to be. * `--list` - (Optional, default=False): Produces a simple list of bucket names. -Note: In this sample, cost_threshold 0.15 would spotlight buckets where enabling soft delete might increase costs by over 15%. +Note: In this sample, if setting cost_threshold 0.15 would spotlight buckets where enabling soft delete might increase costs by over 15%. ``` code-block:: bash - $ python storage_soft_delete_relative_cost_analyzer.py my-project-name + $ python storage_soft_delete_relative_cost_analyzer.py [your-project-name] ``` To disable soft-delete for buckets flagged by the script, follow these steps: diff --git a/storage/cost-analysis/storage_soft_delete_relative_cost_analyzer.py b/storage/cost-analysis/storage_soft_delete_relative_cost_analyzer.py index c2641214055..ea650ab8d52 100644 --- a/storage/cost-analysis/storage_soft_delete_relative_cost_analyzer.py +++ b/storage/cost-analysis/storage_soft_delete_relative_cost_analyzer.py @@ -14,16 +14,16 @@ # See the License for the specific language governing permissions and # limitations under the License. -""" -Identifies buckets with relative increase in cost on enabling the soft-delete. +"""Identifies buckets with relative increase in cost on enabling the soft-delete. The relative increase in cost of using soft delete is calculated by combining -the storage/v2/deleted_bytes metric with the existing storage/v2/total_byte_seconds +the storage/v2/deleted_bytes metric with the existing +storage/v2/total_byte_seconds metric. -Relative cost of each bucket = ('soft delete retention duration' - × 'deleted bytes' / 'total bytes seconds' ) - x 'cost of storing in storage class' +Relative cost of each bucket = ('soft delete retention duration' + × 'deleted bytes' / 'total bytes seconds' ) + x 'cost of storing in storage class' x 'ratio of storage class'. """ @@ -49,7 +49,7 @@ def get_relative_cost(storage_class: str) -> float: "STANDARD": 0.023 / 0.023, "NEARLINE": 0.013 / 0.023, "COLDLINE": 0.007 / 0.023, - "ARCHIVE": 0.0025 / 0.023 + "ARCHIVE": 0.0025 / 0.023, } return relative_cost.get(storage_class, 1.0) @@ -59,7 +59,7 @@ def get_soft_delete_cost( project_name: str, soft_delete_window: int, agg_days: int, - lookback_days: int + lookback_days: int, ) -> Dict[str, List[Dict[str, float]]]: """Calculates soft delete costs for buckets in a Google Cloud project. 
@@ -124,19 +124,25 @@ def calculate_soft_delete_costs( monitoring_client.QueryTimeSeriesRequest( name=f"projects/{project_name}", query=f""" - {{ - fetch gcs_bucket :: storage.googleapis.com/storage/v2/deleted_bytes - | group_by [resource.bucket_name, metric.storage_class, resource.location], window(), .sum; - fetch gcs_bucket :: storage.googleapis.com/storage/v2/total_byte_seconds - | group_by [resource.bucket_name, metric.storage_class, resource.location], window(), .sum - }} - | ratio # Calculate ratios of deleted btyes to total bytes seconds - | value val(0) * {soft_delete_window}\'s\' - | every {agg_days}d - | within {lookback_days}d + {{ # Fetch 1: Soft-deleted (bytes seconds) + fetch gcs_bucket :: storage.googleapis.com/storage/v2/deleted_bytes + | value val(0) * {soft_delete_window}\'s\' # Multiply by soft delete window + | group_by [resource.bucket_name, metric.storage_class], window(), .sum; + + # Fetch 2: Total byte-seconds (active objects) + fetch gcs_bucket :: storage.googleapis.com/storage/v2/total_byte_seconds + | filter metric.type != 'soft-deleted-object' + | group_by [resource.bucket_name, metric.storage_class], window(1d), .mean # Daily average + | group_by [resource.bucket_name, metric.storage_class], window(), .sum # Total over window + + }} # End query definition + | every {agg_days}d # Aggregate over larger time intervals + | within {lookback_days}d # Limit data range for analysis + | ratio # Calculate ratio (soft-deleted (bytes seconds)/ total (bytes seconds)) """, ) ) + buckets: Dict[str, List[Dict[str, float]]] = {} missing_distribution_storage_class = [] for data_point in soft_deleted_bytes_time.time_series_data: @@ -149,7 +155,8 @@ def calculate_soft_delete_costs( soft_delete_ratio = data_point.point_data[0].values[0].double_value distribution_storage_class = bucket_name + " - " + storage_class storage_class_ratio = storage_ratios_by_bucket.get( - distribution_storage_class) + distribution_storage_class + ) if storage_class_ratio is None: missing_distribution_storage_class.append( distribution_storage_class) @@ -159,12 +166,14 @@ def calculate_soft_delete_costs( # 'location': location, "soft_delete_ratio": soft_delete_ratio, "storage_class_ratio": storage_class_ratio, - "relative_storage_class_cost": get_relative_cost(storage_class) + "relative_storage_class_cost": get_relative_cost(storage_class), }) if missing_distribution_storage_class: - print("Missing storage class for following buckets:", - missing_distribution_storage_class) + print( + "Missing storage class for following buckets:", + missing_distribution_storage_class, + ) raise ValueError("Cannot proceed with missing storage class ratios.") return buckets @@ -321,14 +330,14 @@ def soft_delete_relative_cost_analyzer_main() -> None: args.lookback_days, args.list, ) - if (not args.list): + if not args.list: print( "To remove soft-delete policy from the listed buckets run:\n" # Capture output - "python storage_soft_delete_relative_cost_analyzer.py [your-project-name] --[OTHER_OPTIONS] --list >" - " list_of_buckets.txt\n" - "cat list_of_buckets.txt | gcloud storage buckets update -I" - " --clear-soft-delete\n", + "python storage_soft_delete_relative_cost_analyzer.py" + " [your-project-name] --[OTHER_OPTIONS] --list >" + " list_of_buckets.txt\ncat list_of_buckets.txt | gcloud storage buckets" + " update -I --clear-soft-delete\n", "\nThe buckets with approximate costs for soft delete:\n", response, ) From 4a5ef6e83c7e3cc2b2ba81c4a403544a6dc800f6 Mon Sep 17 00:00:00 2001 From: "janitbidhan@gmail.com" Date: Tue, 
20 Feb 2024 13:29:35 -0800 Subject: [PATCH 3/9] Updating the comments --- storage/cost-analysis/README.md | 7 +++++-- .../storage_soft_delete_relative_cost_analyzer.py | 6 ++---- 2 files changed, 7 insertions(+), 6 deletions(-) diff --git a/storage/cost-analysis/README.md b/storage/cost-analysis/README.md index a80a620ce31..ed2de0f9894 100644 --- a/storage/cost-analysis/README.md +++ b/storage/cost-analysis/README.md @@ -43,10 +43,13 @@ Note: In this sample, if setting cost_threshold 0.15 would spotlight buckets whe To disable soft-delete for buckets flagged by the script, follow these steps: ```code-block::bash -# 1. Run the analyzer to generate a list of buckets exceeding your cost threshold: +# 1. Authenticate (if needed): If you're not already authenticated or prefer a specific account, run: +gcloud auth application-default login + +# 2. Run the analyzer to generate a list of buckets exceeding your cost threshold: python storage_soft_delete_relative_cost_analyzer.py [your-project-name] --[OTHER_OPTIONS] --list=True > list_of_buckets.txt -# 2. Update the buckets using the generated list: +# 3. Update the buckets using the generated list: cat list_of_buckets.txt | gcloud storage buckets update -I --clear-soft-delete ``` diff --git a/storage/cost-analysis/storage_soft_delete_relative_cost_analyzer.py b/storage/cost-analysis/storage_soft_delete_relative_cost_analyzer.py index ea650ab8d52..ec90f21f706 100644 --- a/storage/cost-analysis/storage_soft_delete_relative_cost_analyzer.py +++ b/storage/cost-analysis/storage_soft_delete_relative_cost_analyzer.py @@ -335,10 +335,8 @@ def soft_delete_relative_cost_analyzer_main() -> None: "To remove soft-delete policy from the listed buckets run:\n" # Capture output "python storage_soft_delete_relative_cost_analyzer.py" - " [your-project-name] --[OTHER_OPTIONS] --list >" - " list_of_buckets.txt\ncat list_of_buckets.txt | gcloud storage buckets" - " update -I --clear-soft-delete\n", - "\nThe buckets with approximate costs for soft delete:\n", + " [your-project-name] --[OTHER_OPTIONS] --list > list_of_buckets.txt" + "cat list_of_buckets.txt | gcloud storage buckets update -I --clear-soft-delete", response, ) return From f51aea90f1583437494d891ec90218fda1b45251 Mon Sep 17 00:00:00 2001 From: "janitbidhan@gmail.com" Date: Tue, 20 Feb 2024 14:30:03 -0800 Subject: [PATCH 4/9] reformatting the comments --- .../storage_soft_delete_relative_cost_analyzer.py | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/storage/cost-analysis/storage_soft_delete_relative_cost_analyzer.py b/storage/cost-analysis/storage_soft_delete_relative_cost_analyzer.py index ec90f21f706..8a160004d74 100644 --- a/storage/cost-analysis/storage_soft_delete_relative_cost_analyzer.py +++ b/storage/cost-analysis/storage_soft_delete_relative_cost_analyzer.py @@ -335,8 +335,9 @@ def soft_delete_relative_cost_analyzer_main() -> None: "To remove soft-delete policy from the listed buckets run:\n" # Capture output "python storage_soft_delete_relative_cost_analyzer.py" - " [your-project-name] --[OTHER_OPTIONS] --list > list_of_buckets.txt" - "cat list_of_buckets.txt | gcloud storage buckets update -I --clear-soft-delete", + " [your-project-name] --[OTHER_OPTIONS] --list > list_of_buckets.txt \n" + "cat list_of_buckets.txt | gcloud storage buckets update -I " + "--clear-soft-delete", response, ) return From 589a2f72fad80d6a15e48e0737c4e8a00c8231bd Mon Sep 17 00:00:00 2001 From: "janitbidhan@gmail.com" Date: Thu, 22 Feb 2024 09:30:31 -0800 Subject: [PATCH 5/9] fix(sample) - Update 
seconds from int to float in the soft delete relative cost analyzer. --- storage/cost-analysis/README.md | 2 +- .../storage_soft_delete_relative_cost_analyzer.py | 10 +++++----- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/storage/cost-analysis/README.md b/storage/cost-analysis/README.md index ed2de0f9894..eb7f976ff8c 100644 --- a/storage/cost-analysis/README.md +++ b/storage/cost-analysis/README.md @@ -29,7 +29,7 @@ NOTE: Due to the specific functionality related to Google Cloud APIs, this guide **Command-Line Arguments** * `project_name` - (**Required**): Specifies your GCP project name. * `--cost_threshold` - (Optional, default=0): Sets a relative cost threshold. -* `--soft_delete_window` - (Optional, default= 604800 (i.e. 7 days)): Time window (in seconds) for considering soft-deleted objects.. +* `--soft_delete_window` - (Optional, default= 604800.0 (i.e. 7 days)): Time window (in seconds) for considering soft-deleted objects.. * `--agg_days` - (Optional, default=30): The period over which to combine and aggregate results. * `--lookback_days` - (Optional, default=360): Time window (in days) for considering the how old the bucket to be. * `--list` - (Optional, default=False): Produces a simple list of bucket names. diff --git a/storage/cost-analysis/storage_soft_delete_relative_cost_analyzer.py b/storage/cost-analysis/storage_soft_delete_relative_cost_analyzer.py index 8a160004d74..53a3f71259c 100644 --- a/storage/cost-analysis/storage_soft_delete_relative_cost_analyzer.py +++ b/storage/cost-analysis/storage_soft_delete_relative_cost_analyzer.py @@ -57,7 +57,7 @@ def get_relative_cost(storage_class: str) -> float: def get_soft_delete_cost( project_name: str, - soft_delete_window: int, + soft_delete_window: float, agg_days: int, lookback_days: int, ) -> Dict[str, List[Dict[str, float]]]: @@ -98,7 +98,7 @@ def get_soft_delete_cost( def calculate_soft_delete_costs( project_name: str, query_client: monitoring_client.QueryServiceClient, - soft_delete_window: int, + soft_delete_window: float, storage_ratios_by_bucket: Dict[str, float], agg_days: int, lookback_days: int, @@ -234,7 +234,7 @@ def get_storage_class_ratio( def soft_delete_relative_cost_analyzer( project_name: str, cost_threshold: float = 0.0, - soft_delete_window: int = 604800, + soft_delete_window: float = 604800, agg_days: int = 30, lookback_days: int = 360, list_buckets: bool = False, @@ -292,8 +292,8 @@ def soft_delete_relative_cost_analyzer_main() -> None: ) parser.add_argument( "--soft_delete_window", - type=int, - default=604800, + type=float, + default=604800.0, help="Time window (in seconds) for considering soft-deleted objects.", ) parser.add_argument( From 48aab3f3f2f19d9bd9a37499131f51006ac7a049 Mon Sep 17 00:00:00 2001 From: Janit Bidhan Date: Fri, 12 Apr 2024 10:52:59 -0700 Subject: [PATCH 6/9] Update README.md for Soft Delete Cost Analysis --- storage/cost-analysis/README.md | 92 +++++++++++++++++++++++++++++++-- 1 file changed, 87 insertions(+), 5 deletions(-) diff --git a/storage/cost-analysis/README.md b/storage/cost-analysis/README.md index eb7f976ff8c..d9ac90b709a 100644 --- a/storage/cost-analysis/README.md +++ b/storage/cost-analysis/README.md @@ -11,20 +11,25 @@ NOTE: Due to the specific functionality related to Google Cloud APIs, this guide ### Google Cloud Storage Soft Delete Cost Analyzer ------------------------------------------------------------------------------- -**Understanding Soft Delete and Cost Considerations** +**Purpose** + +* Helps you understand the potential cost 
implications of enabling soft delete on your Google Cloud Storage buckets. +* Identifies buckets where soft delete might lead to significant cost increases. + +**Key Concepts** 1. Soft Delete: A feature for protecting against accidental data loss. Deleted objects are retained for a defined period before permanent deletion. This adds safety but carries potential additional storage costs. 2. Cost Analysis: This script evaluates the relative cost increase within each bucket if soft delete is enabled. Considerations include: - * Your soft delete retention window + * Your soft delete retention period * Amount of data likely to be soft-deleted * Proportions of data in different storage classes (e.g., Standard, Nearline) -**How to Use the Script** +**How to Use** **Prerequisites** 1. A Google Cloud Platform (GCP) Project with existing buckets. 2. Permissions on your GCP project to interact with Google Cloud Storage and Monitoring APIs. - 3. A Python environment (https://cloud.google.com/python/setup) + 3. A Python environment (https://cloud.google.com/python/setup) [ Python version > Python 3.11.6 ] **Command-Line Arguments** * `project_name` - (**Required**): Specifies your GCP project name. @@ -32,7 +37,7 @@ NOTE: Due to the specific functionality related to Google Cloud APIs, this guide * `--soft_delete_window` - (Optional, default= 604800.0 (i.e. 7 days)): Time window (in seconds) for considering soft-deleted objects.. * `--agg_days` - (Optional, default=30): The period over which to combine and aggregate results. * `--lookback_days` - (Optional, default=360): Time window (in days) for considering the how old the bucket to be. -* `--list` - (Optional, default=False): Produces a simple list of bucket names. +* `--list` - (Optional, default=False): Produces a simple list of bucket names if set as True. Note: In this sample, if setting cost_threshold 0.15 would spotlight buckets where enabling soft delete might increase costs by over 15%. @@ -55,3 +60,80 @@ cat list_of_buckets.txt | gcloud storage buckets update -I --clear-soft-delete ``` **Important Note:** Disabling soft-delete for flagged buckets means when deleting it will permanently delete files. These files cannot be restored, even if a soft-delete policy is later re-enabled. + +------------------------------------------------------------------------------- + +### SCRIPT EXPLAINATION +The `storage_soft_delete_relative_cost_analyzer.py` script assesses the potential cost impact of enabling soft delete on Google Cloud Storage buckets. It utilizes the Google Cloud Monitoring API to retrieve relevant metrics and perform calculations. + +#### Functionality: + +1. Calculating Relative Soft Delete Cost: + * Fetches data on soft-deleted bytes and total byte-seconds for each bucket and storage class using the Monitoring API. + * Calculates the ratio of soft-deleted bytes to total byte-seconds, representing the relative amount of inactive data. + * Considers storage class pricing to determine the cost impact of storing this inactive data. + +2. Identifying Costly Buckets: + + * Compares the calculated relative cost to a user-defined threshold. + * Flags buckets where soft delete might lead to significant cost increases. + +3. Output Options: + + * Can output a detailed JSON report with cost data for each bucket, suitable for further analysis or plotting. + * Alternatively, generates a simple list of bucket names exceeding the cost threshold. 
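+
+As a rough illustration of the first option, here is a minimal hedged sketch of
+post-processing the JSON report in Python (the file name and the flat
+`{bucket_name: relative_cost}` layout are assumptions for this sketch, not the
+script's documented output format):
+
+```code-block::python
+# Hypothetical post-processing of the analyzer's JSON report.
+# The file name and the flat {bucket_name: relative_cost} layout are
+# assumptions for illustration; adapt them to the report you generate.
+import json
+
+with open("soft_delete_costs.json") as f:
+    costs = json.load(f)
+
+THRESHOLD = 0.15  # flag buckets whose relative cost increase exceeds 15%
+flagged = {bucket: cost for bucket, cost in costs.items() if cost > THRESHOLD}
+
+for bucket, cost in sorted(flagged.items(), key=lambda item: -item[1]):
+    print(f"{bucket}: ~{cost:.0%} estimated cost increase")
+```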
+ +#### Key Functions: + + * `soft_delete_relative_cost_analyzer`: Handles command-line input and output, calling 'get_soft_delete_cost' for the project. + + * `get_soft_delete_cost`: Orchestrates the cost analysis, using: + + * `get_relative_cost`: Retrieves the relative cost multiplier for a given storage class (e.g., "STANDARD", "NEARLINE") compared to the standard class. The cost for each class are pre-defined within the function and could be adjusted based on regional pricing variations. + * `calculate_soft_delete_costs`: Executes Monitoring API queries and calculates costs. + * `get_storage_class_ratio`: Fetches data on storage class distribution within buckets. + +#### Monitoring API Queries + +The script relies on the Google Cloud Monitoring API to fetch essential data for calculating soft delete costs. It employs the `query_client.query_time_series` method to execute specifically crafted queries that retrieve metrics from Google Cloud Storage. + +1. `calculate_soft_delete_costs` + * This function encapsulates the most intricate query, which concurrently retrieves two metrics: + * `storage.googleapis.com/storage/v2/deleted_bytes`: This metric quantifies the volume of data, in bytes, that has undergone soft deletion. + * `storage.googleapis.com/storage/v2/total_byte_seconds`: This metric records the cumulative byte-seconds of data stored within the bucket, excluding objects marked for soft deletion. + * Subsequently, the query computes the ratio of these two metrics, yielding the proportion of soft-deleted data relative to the total data volume within each bucket. + +3. `get_storage_class_ratio` + + * This function employs a less complex query to re-acquire the `storage.googleapis.com/storage/v2/total_byte_seconds` metric again.metric. However, in this instance, it focuses on segregating and aggregating the data based on the storage class associated with each object within the bucket. + * The resultant output is a breakdown elucidating the distribution of data across various storage classes, facilitating a more granular cost analysis. For example, a result like `{ "bucket_name-STANDARD": 0.90, "bucket_name-NEARLINE": 0.10 }` indicates that the bucket's data is stored across two storage classes with a ratio of 9:1. + +#### Key Formula + +The relative increase in cost of using soft delete is calculated by combining the output of above mentioned queries and the for each bucket, + +``` +Relative cost of each bucket = ('deleted bytes' / 'total bytes seconds' ) + × 'soft delete retention duration' + x 'cost of storing in storage class' + x 'ratio of storage class'. +``` + +where, + * `Soft Delete Retention Duration`: The number of days (or seconds) that soft-deleted objects are retained before permanent deletion. Longer retention periods increase potential costs. + * `Deleted Bytes`: The amount of data (in bytes) that has been soft-deleted within the bucket. + * `Total Bytes Seconds`: A cumulative measure of all data stored in the bucket (including active and soft-deleted objects) over time, expressed in byte-seconds (bytes * seconds). + * `Cost of Storing in Storage Class`: The per-byte-second cost of the specific storage class where the soft-deleted data resides (e.g., Standard, Nearline, Coldline). + * `Ratio of Storage Class`: The proportion of the bucket's data that belongs to the specific storage class being considered. + +##### Explaination of each Steps: + +1. Soft Delete Ratio: Divide `Deleted Bytes` by `Total Bytes Seconds` to get the fraction of data that is soft-deleted. 
This indicates how much of the overall storage is occupied by inactive, potentially deletable data. +2. Cost Impact: + * Multiply the `Soft Delete Ratio` by the `Soft Delete Retention Duration` to account for the extra time this data is being stored. + * Multiply this result by the 'Cost of Storing in Storage Class` to factor in the pricing of the specific storage class. + * Finally, multiply by the `Ratio of Storage Class` to consider only the portion of the cost attributable to that particular class. + +The final result represents the relative increase in cost due to soft delete, expressed as a fraction or percentage. A higher value indicates a more significant cost impact. This allows you to assess whether the benefits of soft delete (data protection) outweigh the additional storage expenses for each bucket and storage class. Example: If the calculated relative cost increase is 0.15 (or 15%), it means that enabling soft delete for that bucket/storage class would increase your storage costs by approximately 15%. + + From 045d72bb88cff5e770ac714fab0b076ec3019d49 Mon Sep 17 00:00:00 2001 From: Janit Bidhan Date: Tue, 16 Apr 2024 12:51:21 -0700 Subject: [PATCH 7/9] Updated README.md to include script explanation --- storage/cost-analysis/README.md | 63 ++++++++++++++++++--------------- 1 file changed, 34 insertions(+), 29 deletions(-) diff --git a/storage/cost-analysis/README.md b/storage/cost-analysis/README.md index d9ac90b709a..b32210cd099 100644 --- a/storage/cost-analysis/README.md +++ b/storage/cost-analysis/README.md @@ -1,23 +1,23 @@ -Google Cloud Storage Python Samples +# Cloud Storage Python Samples =============================================================================== [![Open in Cloud Shell button](https://gstatic.com/cloudssh/images/open-btn.png)](https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/GoogleCloudPlatform/python-docs-samples&page=editor&open_in_editor=storage/s3-sdk/README.rst) -**Google Cloud Storage:** https://cloud.google.com/storage/docs +**Cloud Storage:** https://cloud.google.com/storage/docs Samples ------------------------------------------------------------------------------- -NOTE: Due to the specific functionality related to Google Cloud APIs, this guide assumes a base level of familiarity with Google Cloud Storage features, terminology, and pricing. +NOTE: Due to the specific functionality related to Google Cloud APIs, this guide assumes a base level of familiarity with the Cloud Storage features, terminology, and pricing. ### Google Cloud Storage Soft Delete Cost Analyzer ------------------------------------------------------------------------------- **Purpose** * Helps you understand the potential cost implications of enabling soft delete on your Google Cloud Storage buckets. -* Identifies buckets where soft delete might lead to significant cost increases. +* Identifies buckets where the cost of enabling soft delete exceeds some threshold. **Key Concepts** - 1. Soft Delete: A feature for protecting against accidental data loss. Deleted objects are retained for a defined period before permanent deletion. This adds safety but carries potential additional storage costs. + 1. [Soft Delete](https://cloud.google.com/storage/docs/soft-delete): The soft delete feature lets you preserve all deleted or overwritten objects in your bucket for a specified duration. 2. Cost Analysis: This script evaluates the relative cost increase within each bucket if soft delete is enabled. 
Considerations include: * Your soft delete retention period * Amount of data likely to be soft-deleted @@ -47,19 +47,20 @@ Note: In this sample, if setting cost_threshold 0.15 would spotlight buckets whe To disable soft-delete for buckets flagged by the script, follow these steps: +1. Authenticate (if needed): If you're not already authenticated or prefer a specific account, run: ```code-block::bash -# 1. Authenticate (if needed): If you're not already authenticated or prefer a specific account, run: -gcloud auth application-default login - -# 2. Run the analyzer to generate a list of buckets exceeding your cost threshold: -python storage_soft_delete_relative_cost_analyzer.py [your-project-name] --[OTHER_OPTIONS] --list=True > list_of_buckets.txt - -# 3. Update the buckets using the generated list: -cat list_of_buckets.txt | gcloud storage buckets update -I --clear-soft-delete - +$ gcloud auth application-default login +``` +2. Run the analyzer to generate a list of buckets exceeding your cost threshold: +```code-block::bash +$ python storage_soft_delete_relative_cost_analyzer.py [your-project-name] --[OTHER_OPTIONS] --list=True > list_of_buckets.txt +``` +3. Update the buckets using the generated list: +```code-block::bash +$ cat list_of_buckets.txt | gcloud storage buckets update -I --clear-soft-delete ``` -**Important Note:** Disabling soft-delete for flagged buckets means when deleting it will permanently delete files. These files cannot be restored, even if a soft-delete policy is later re-enabled. +**Important Note:** Disabling soft-delete for flagged buckets means delete operations will permanently delete files. These files cannot be restored, even if a soft-delete policy is later re-enabled. ------------------------------------------------------------------------------- @@ -113,27 +114,31 @@ The script relies on the Google Cloud Monitoring API to fetch essential data for The relative increase in cost of using soft delete is calculated by combining the output of above mentioned queries and the for each bucket, ``` -Relative cost of each bucket = ('deleted bytes' / 'total bytes seconds' ) - × 'soft delete retention duration' - x 'cost of storing in storage class' - x 'ratio of storage class'. +Relative cost of each bucket = deleted_bytes / total_byte_seconds + x Soft delete retention duration-seconds + x Relative Storage Cost + x Storage Class Ratio ``` where, - * `Soft Delete Retention Duration`: The number of days (or seconds) that soft-deleted objects are retained before permanent deletion. Longer retention periods increase potential costs. - * `Deleted Bytes`: The amount of data (in bytes) that has been soft-deleted within the bucket. - * `Total Bytes Seconds`: A cumulative measure of all data stored in the bucket (including active and soft-deleted objects) over time, expressed in byte-seconds (bytes * seconds). - * `Cost of Storing in Storage Class`: The per-byte-second cost of the specific storage class where the soft-deleted data resides (e.g., Standard, Nearline, Coldline). - * `Ratio of Storage Class`: The proportion of the bucket's data that belongs to the specific storage class being considered. + + * `Deleted Bytes`: It is same as `storage/v2/deleted_bytes`. Delta count of deleted bytes per bucket, + * `Total Bytes Seconds`: It is same as `storage/v2/total_byte_seconds`. Total daily storage in byte*seconds used by the bucket, grouped by storage class and type where type can be live-object, noncurrent-object, soft-deleted-object and multipart-upload. 
+ * `Soft delete retention duration-seconds`: Soft Delete window defined for the bucket, this is the threshold to be provided to test out this relative cost script. + * `Relative Storage Cost`: The cost of storing data in a specific storage class (e.g., Standard, Nearline, Coldline) relative to the Standard class (where Standard class cost is 1). + * `Storage Class Ratio`: The proportion of the bucket's data that belongs to the specific storage class being considered. + +Please note the following Cloud Monitoring metrics: +`storage/v2/deleted_bytes` and `storage/v2/total_byte_seconds` are defined on [https://cloud.google.com/monitoring/api/metrics_gcp#gcp-storage](https://cloud.google.com/monitoring/api/metrics_gcp#gcp-storage) ##### Explaination of each Steps: -1. Soft Delete Ratio: Divide `Deleted Bytes` by `Total Bytes Seconds` to get the fraction of data that is soft-deleted. This indicates how much of the overall storage is occupied by inactive, potentially deletable data. +1. Soft Delete Rate: Dividing 'Deleted Bytes' by 'Total Bytes Seconds' gives you the rate at which data is being soft-deleted (per second). This shows how quickly data marked for deletion accumulates in the bucket. 2. Cost Impact: - * Multiply the `Soft Delete Ratio` by the `Soft Delete Retention Duration` to account for the extra time this data is being stored. - * Multiply this result by the 'Cost of Storing in Storage Class` to factor in the pricing of the specific storage class. - * Finally, multiply by the `Ratio of Storage Class` to consider only the portion of the cost attributable to that particular class. + * Multiply the `Soft Delete Rate` by the `Soft delete retention duration-seconds` to get the total ratio of data that is soft-deleted and retained within the specified period. + * Multiply this result by the 'Relative Storage Cost` to factor in the pricing of the specific storage class. + * Finally, multiply by the `Storage Class Ratio` to consider only the portion of the cost attributable to that particular class. -The final result represents the relative increase in cost due to soft delete, expressed as a fraction or percentage. A higher value indicates a more significant cost impact. This allows you to assess whether the benefits of soft delete (data protection) outweigh the additional storage expenses for each bucket and storage class. Example: If the calculated relative cost increase is 0.15 (or 15%), it means that enabling soft delete for that bucket/storage class would increase your storage costs by approximately 15%. +The final result represents the relative increase in cost due to soft delete, expressed as a fraction or percentage. If cost is 1 then no cost increase otherwise more increase makes more cost. This allows you to assess whether the benefits of soft delete (data protection) outweigh the additional storage expenses for each bucket and storage class. Example: If the calculated relative cost increase is 0.15 (or 15%), it means that enabling soft delete for that bucket/storage class would increase your storage costs by approximately 15%. 
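+
+To make the arithmetic concrete, here is a small hedged sketch of the formula
+above (all input values are invented for illustration; the function and its
+parameter names are not part of the script):
+
+```code-block::python
+# Hypothetical worked example of the relative-cost formula above.
+# All inputs are illustrative values, not real metric data.
+def relative_soft_delete_cost(
+    deleted_bytes: float,          # cf. storage/v2/deleted_bytes
+    total_byte_seconds: float,     # cf. storage/v2/total_byte_seconds
+    retention_seconds: float,      # soft delete retention duration, in seconds
+    relative_storage_cost: float,  # e.g., 1.0 for STANDARD
+    storage_class_ratio: float,    # fraction of the bucket's data in the class
+) -> float:
+    """Relative cost increase of soft delete for one bucket/storage class."""
+    soft_delete_ratio = deleted_bytes / total_byte_seconds
+    return (soft_delete_ratio * retention_seconds
+            * relative_storage_cost * storage_class_ratio)
+
+# 20 GiB soft-deleted, 1 TiB stored for one day, 7-day window, all STANDARD:
+cost = relative_soft_delete_cost(
+    deleted_bytes=20 * 1024**3,
+    total_byte_seconds=1024**4 * 86400,
+    retention_seconds=604800.0,
+    relative_storage_cost=1.0,
+    storage_class_ratio=1.0,
+)
+print(f"{cost:.2%}")  # ~13.67%, i.e., roughly a 14% relative cost increase
+```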
From 31f12c290fdd8deecdd327981fa2b97f0dd59d8d Mon Sep 17 00:00:00 2001
From: Janit Bidhan
Date: Fri, 26 Apr 2024 13:44:42 -0700
Subject: [PATCH 8/9] Updated formula text in the
 storage_soft_delete_relative_cost_analyzer.py

---
 .../storage_soft_delete_relative_cost_analyzer.py | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/storage/cost-analysis/storage_soft_delete_relative_cost_analyzer.py b/storage/cost-analysis/storage_soft_delete_relative_cost_analyzer.py
index 53a3f71259c..7103508fbc7 100644
--- a/storage/cost-analysis/storage_soft_delete_relative_cost_analyzer.py
+++ b/storage/cost-analysis/storage_soft_delete_relative_cost_analyzer.py
@@ -21,10 +21,10 @@
 storage/v2/total_byte_seconds
 metric.
 
-Relative cost of each bucket = ('soft delete retention duration'
-                                × 'deleted bytes' / 'total bytes seconds' )
-                                x 'cost of storing in storage class'
-                                x 'ratio of storage class'.
+Relative cost of each bucket = deleted_bytes / total_byte_seconds
+                               x Soft delete retention duration-seconds
+                               x Relative Storage Cost
+                               x Storage Class Ratio
 """
 
 # [START storage_soft_delete_relative_cost]

From a0b64f8d0b926c03633b29ddf76de0a850fa17c8 Mon Sep 17 00:00:00 2001
From: Janit Bidhan
Date: Fri, 26 Apr 2024 13:45:50 -0700
Subject: [PATCH 9/9] docs: Enhance README clarity with in-depth script
 details & key formula breakdown

---
 storage/cost-analysis/README.md | 308 +++++++++++++++++++++-----------
 1 file changed, 199 insertions(+), 109 deletions(-)

diff --git a/storage/cost-analysis/README.md b/storage/cost-analysis/README.md
index b32210cd099..2dcd8271730 100644
--- a/storage/cost-analysis/README.md
+++ b/storage/cost-analysis/README.md
@@ -1,117 +1,185 @@
 # Cloud Storage Python Samples
+
 ===============================================================================
 [![Open in Cloud Shell button](https://gstatic.com/cloudssh/images/open-btn.png)](https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/GoogleCloudPlatform/python-docs-samples&page=editor&open_in_editor=storage/s3-sdk/README.rst)
-**Cloud Storage:** https://cloud.google.com/storage/docs
-
-Samples
--------------------------------------------------------------------------------
-NOTE: Due to the specific functionality related to Google Cloud APIs, this guide assumes a base level of familiarity with the Cloud Storage features, terminology, and pricing.
-
-### Google Cloud Storage Soft Delete Cost Analyzer
--------------------------------------------------------------------------------
-**Purpose**
+**Cloud Storage:** https://cloud.google.com/storage/docs
+
+## Samples
+
+NOTE: Due to the specific functionality related to Google Cloud APIs, this guide
+assumes a base level of familiarity with Cloud Storage features,
+terminology, and pricing.
-**How to Use** +## Cloud Storage Soft Delete Cost Analyzer -**Prerequisites** +### Purpose - 1. A Google Cloud Platform (GCP) Project with existing buckets. - 2. Permissions on your GCP project to interact with Google Cloud Storage and Monitoring APIs. - 3. A Python environment (https://cloud.google.com/python/setup) [ Python version > Python 3.11.6 ] +* Helps you understand the potential cost implications of enabling + [Soft Delete](https://cloud.google.com/storage/docs/soft-delete) on your + Cloud Storage [buckets](https://cloud.google.com/storage/docs/buckets). +* Identifies buckets where the cost of enabling soft delete exceeds some + threshold based on past usage. -**Command-Line Arguments** -* `project_name` - (**Required**): Specifies your GCP project name. -* `--cost_threshold` - (Optional, default=0): Sets a relative cost threshold. -* `--soft_delete_window` - (Optional, default= 604800.0 (i.e. 7 days)): Time window (in seconds) for considering soft-deleted objects.. -* `--agg_days` - (Optional, default=30): The period over which to combine and aggregate results. -* `--lookback_days` - (Optional, default=360): Time window (in days) for considering the how old the bucket to be. -* `--list` - (Optional, default=False): Produces a simple list of bucket names if set as True. +### Prerequisites -Note: In this sample, if setting cost_threshold 0.15 would spotlight buckets where enabling soft delete might increase costs by over 15%. +* A + [Google Cloud project](https://cloud.google.com/resource-manager/docs/creating-managing-projects) + with existing buckets. +* Permissions on your Google Cloud project to interact with Cloud Storage and + Monitoring APIs. +* A [Python development environment](https://cloud.google.com/python/setup) + with version >= `3.11` -``` code-block:: bash - $ python storage_soft_delete_relative_cost_analyzer.py [your-project-name] -``` +### Running the script using the command line To disable soft-delete for buckets flagged by the script, follow these steps: -1. Authenticate (if needed): If you're not already authenticated or prefer a specific account, run: -```code-block::bash -$ gcloud auth application-default login -``` -2. Run the analyzer to generate a list of buckets exceeding your cost threshold: -```code-block::bash -$ python storage_soft_delete_relative_cost_analyzer.py [your-project-name] --[OTHER_OPTIONS] --list=True > list_of_buckets.txt -``` -3. Update the buckets using the generated list: -```code-block::bash -$ cat list_of_buckets.txt | gcloud storage buckets update -I --clear-soft-delete -``` - -**Important Note:** Disabling soft-delete for flagged buckets means delete operations will permanently delete files. These files cannot be restored, even if a soft-delete policy is later re-enabled. - -------------------------------------------------------------------------------- - -### SCRIPT EXPLAINATION -The `storage_soft_delete_relative_cost_analyzer.py` script assesses the potential cost impact of enabling soft delete on Google Cloud Storage buckets. It utilizes the Google Cloud Monitoring API to retrieve relevant metrics and perform calculations. - -#### Functionality: +1. Authenticate (if needed): If you're not already authenticated or prefer a + specific account, run: + + ```code-block::bash + gcloud auth application-default login + ``` + +2. 
+2. Run the analyzer to generate a list of buckets exceeding your cost
+   threshold:
+
+   ```code-block::bash
+   python storage_soft_delete_relative_cost_analyzer.py [project_name] --[OTHER_OPTIONS] --list=True > list_of_buckets.txt
+   ```
+
+   ARGUMENTS:
+
+   * `project_name` - (**Required**): Specifies your Google Cloud project
+     name.
+   * `--cost_threshold` - (Optional, default=0): Sets a relative cost
+     threshold. For example, if `cost_threshold` is set to 0.15, the script
+     will return buckets where the estimated relative cost increase exceeds
+     15%.
+   * `--soft_delete_window` - (Optional, default=604800.0 (i.e. 7 days)):
+     Time window (in seconds) for considering soft-deleted objects.
+   * `--agg_days` - (Optional, default=30): The period over which to combine
+     and aggregate results.
+   * `--lookback_days` - (Optional, default=360): Time window (in days) that
+     describes how far back in time the analysis should consider data.
+   * `--list` - (Optional, default=False): Produces a simple list of bucket
+     names if set to True.
+
+   Example with all the optional parameters set to their default values:
+
+   ```code-block::bash
+   python storage_soft_delete_relative_cost_analyzer.py [project_name] \
+       --cost_threshold=0 \
+       --soft_delete_window=604800.0 \
+       --agg_days=30 \
+       --lookback_days=360 \
+       --list=False
+   ```
+
+3. Update the buckets using the generated list:
+
+   ```code-block::bash
+   cat list_of_buckets.txt | gcloud storage buckets update -I --clear-soft-delete
+   ```
+
+**Important Note:** If a bucket has soft delete
+disabled, delete requests that include an object's generation number will
+permanently delete the object. Additionally, any request in buckets with object
+versioning disabled that causes an object to be deleted or overwritten will
+permanently delete the object.
+
+--------------------------------------------------------------------------------
+
+### Script Explanation
+
+The `storage_soft_delete_relative_cost_analyzer.py` script assesses the
+potential cost impact of enabling soft delete on Cloud Storage buckets. It uses
+the [Cloud Monitoring API](https://cloud.google.com/monitoring/api/v3) to
+retrieve relevant metrics and perform calculations.
+
+#### Functionality
+
+1. Calculates the relative cost of soft delete:
+
+   * Fetches data on soft-deleted bytes and total byte-seconds for each
+     bucket and
+     [storage class](https://cloud.google.com/storage/docs/storage-classes)
+     using the Monitoring API.
+   * Calculates the ratio of soft-deleted bytes to total byte-seconds,
+     representing the relative amount of inactive data.
+   * Considers
+     [storage class pricing](https://cloud.google.com/storage/pricing)
+     relative to the Standard storage class to determine the cost impact of
+     storing this inactive data.
+
+2. Identifies buckets that exceed a cost threshold:
+
+   * Compares the calculated relative cost to a user-defined threshold.
+   * Flags buckets where soft delete might lead to significant cost
+     increases.
+
+3. Provides two different output options: JSON or list of buckets.
+
+   * Can output a detailed JSON with relative cost for each bucket, suitable
+     for further analysis or plotting.
+   * Alternatively, generates a simple list of bucket names exceeding the
+     cost threshold. This output can be directly piped into the gcloud
+     storage CLI as described above.
+
+#### Key Functions
+
+* `soft_delete_relative_cost_analyzer`: Handles command-line input and output,
+  calling `get_soft_delete_cost` for the Google Cloud project.
+  * `get_soft_delete_cost`: Orchestrates the cost analysis, using:
+
+    * `get_relative_cost`: Retrieves the relative cost multiplier for a
+      given storage class (e.g., "STANDARD", "NEARLINE") compared to the
+      Standard class. The costs for each class are pre-defined within the
+      function and can be adjusted based on regional pricing variations.
+    * `calculate_soft_delete_costs`: Executes Monitoring API queries and
+      calculates costs.
+    * `get_storage_class_ratio`: Fetches data on storage class
+      distribution within buckets.
+
+#### Monitoring API Queries
+
+The script relies on the Cloud Monitoring API to fetch essential data for
+calculating soft delete costs. It employs the `query_client.query_time_series`
+method to execute specifically crafted queries that retrieve metrics from Cloud
+Storage.
+
+1. `calculate_soft_delete_costs`
+
+   * This function calculates the proportion of soft-deleted data relative to
+     the total data volume within each bucket. The calculation is based on
+     the following metrics:
+     * `storage.googleapis.com/storage/v2/deleted_bytes`: This metric
+       quantifies the volume of data, in bytes, that has undergone soft
+       deletion.
+     * `storage.googleapis.com/storage/v2/total_byte_seconds`: This metric
+       records the cumulative byte-seconds of data stored within the
+       bucket, excluding objects marked for soft deletion.
+
+2. `get_storage_class_ratio`
+
+   * This function uses a query to re-acquire the
+     `storage.googleapis.com/storage/v2/total_byte_seconds` metric. However,
+     in this instance, it focuses on segregating and aggregating the data
+     based on the storage class associated with each object within the
+     bucket.
+   * The resultant output is a distribution of data across various storage
+     classes, facilitating a more granular cost analysis.
+     For example, a result like
+     `{ "bucket_name-STANDARD": 0.90, "bucket_name-NEARLINE": 0.10 }`
+     indicates that the bucket's data is stored across two storage classes
+     with a ratio of 9:1.
+
+--------------------------------------------------------------------------------
+
+### Key Formula
+
+The relative increase in cost of using soft delete is calculated by combining,
+for each bucket, the output of the above-mentioned queries:
+
+```
+Relative cost of each bucket = deleted_bytes / total_byte_seconds
+                               x Soft delete retention duration-seconds
+                               x Relative Storage Cost
+                               x Storage Class Ratio
+```
+
+where,
+
+* `Deleted Bytes`: It is the same as `storage/v2/deleted_bytes`. Delta count of
+  deleted bytes per bucket.
+* `Total Bytes Seconds`: It is the same as `storage/v2/total_byte_seconds`.
+  Total daily storage in byte*seconds used by the bucket, grouped by storage
+  class and type where type can be live-object, noncurrent-object,
+  soft-deleted-object and multipart-upload.
+* `Soft delete retention duration-seconds`: The soft delete window defined for
+  the bucket; this is the value supplied to the script through
+  `--soft_delete_window` when testing the relative cost.
+* `Relative Storage Cost`: The cost of storing data in a specific storage
+  class (e.g., Standard, Nearline, Coldline) relative to the Standard class
+  (where Standard class cost is 1).
+* `Storage Class Ratio`: The proportion of the bucket's data that belongs to
+  the specific storage class being considered.
+
+Note that the Cloud Monitoring metrics `storage/v2/deleted_bytes` and
+`storage/v2/total_byte_seconds` are defined at
+[https://cloud.google.com/monitoring/api/metrics_gcp#gcp-storage](https://cloud.google.com/monitoring/api/metrics_gcp#gcp-storage)
+
+#### Stepwise Explanation
+
+1. Soft Delete Rate: Dividing `Deleted Bytes` by `Total Bytes Seconds` gives
+   you the rate at which data is being soft-deleted (per second). This shows
+   how quickly data marked for deletion accumulates in the bucket.
+2. Cost Impact:
+
+   * Multiply the `Soft Delete Rate` by the `Soft delete retention
+     duration-seconds` to get the total ratio of data that is soft-deleted
+     and retained within the specified period.
+   * Multiply this result by the `Relative Storage Cost` to factor in the
+     pricing of the specific storage class.
+   * Finally, multiply by the `Storage Class Ratio` to consider only the
+     portion of the cost attributable to that particular class.
+
+The script analyzes your bucket usage history to estimate the relative cost
+increase of enabling soft delete. It outputs a relative cost value for each
+bucket, representing the increase in cost compared to current pricing if usage
+patterns continue. For instance, `{"Bucket_A": 1.15, "Bucket_B": 1.05}`
+indicates a `15%` price increase for `Bucket_A` and a `5%` increase for
+`Bucket_B` with soft delete enabled for the defined `Soft delete retention
+duration-seconds`. This output allows you to weigh the benefits of data
+protection against the added storage expenses for each bucket and storage
+class, helping you make informed decisions about enabling soft delete.
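+
+For readers who want to experiment with the underlying Monitoring data
+directly, here is a minimal sketch of issuing one MQL query with
+`query_time_series` ("my-project" is a placeholder, and the query is a
+simplified variant of the script's queries, not their exact text):
+
+```code-block::python
+# Minimal sketch of one Monitoring MQL query like those described above.
+# "my-project" is a placeholder project ID.
+from google.cloud import monitoring_v3
+
+client = monitoring_v3.QueryServiceClient()
+request = monitoring_v3.QueryTimeSeriesRequest(
+    name="projects/my-project",
+    query="""
+    fetch gcs_bucket :: storage.googleapis.com/storage/v2/deleted_bytes
+    | group_by [resource.bucket_name, metric.storage_class], window(), .sum
+    | within 30d
+    """,
+)
+
+for series in client.query_time_series(request):
+    bucket, storage_class = (v.string_value for v in series.label_values[:2])
+    # Summed deleted bytes for this bucket/storage-class pair over the window.
+    print(bucket, storage_class, series.point_data[0].values[0])
+```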