Commit 338fab3

feat(dlp): Sample for inspect GCS, BigQuery, and Datastore send to scc (GoogleCloudPlatform#1867)
1 parent fee6e59 commit 338fab3

6 files changed: +866 -0 lines changed

dlp/README.md

Lines changed: 62 additions & 0 deletions
@@ -45,6 +45,68 @@ This simple command-line application demonstrates how to invoke

See the [DLP Documentation](https://cloud.google.com/dlp/docs/inspecting-text) for more information.

## Testing

### Setup
- Ensure that `GOOGLE_APPLICATION_CREDENTIALS` points to an authorized service account credentials file.
- [Create a Google Cloud Project](https://console.cloud.google.com/projectcreate) and set the `GOOGLE_PROJECT_ID` environment variable.
```
export GOOGLE_PROJECT_ID=YOUR_PROJECT_ID
```
- [Create a Google Cloud Storage bucket](https://console.cloud.google.com/storage) and upload [test.txt](src/test/data/test.txt).
- Set the `GOOGLE_STORAGE_BUCKET` environment variable to the bucket name.
- Set the `GCS_PATH` environment variable to the `gs://` path of the uploaded file.
```
export GOOGLE_STORAGE_BUCKET=YOUR_BUCKET
export GCS_PATH=gs://${GOOGLE_STORAGE_BUCKET}/test.txt
```
- Set the `DLP_DEID_WRAPPED_KEY` environment variable to an AES-256 key encrypted ('wrapped') [with a Cloud Key Management Service (KMS) key](https://cloud.google.com/kms/docs/encrypt-decrypt).
- Set the `DLP_DEID_KEY_NAME` environment variable to the resource name of the Cloud KMS key you wrapped `DLP_DEID_WRAPPED_KEY` with.
```
export DLP_DEID_WRAPPED_KEY=YOUR_ENCRYPTED_AES_256_KEY
export DLP_DEID_KEY_NAME=projects/${GOOGLE_PROJECT_ID}/locations/YOUR_LOCATION/keyRings/YOUR_KEYRING_NAME/cryptoKeys/YOUR_KEY_NAME
```
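As a rough illustration of the two values above, the sketch below generates a raw 256-bit key and assembles a KMS key resource name in the format shown. The function names `generate_raw_aes256_key` and `kms_key_name` are hypothetical, and the wrapping step itself (encrypting the raw key with Cloud KMS) is not shown.

```php
<?php
// Hypothetical helpers; they are not part of the samples in this repository.

/**
 * Generate a random 256-bit (32-byte) key, base64-encoded for safe handling.
 * The raw key still has to be encrypted with a Cloud KMS key before it can
 * serve as DLP_DEID_WRAPPED_KEY; that step is outside this sketch.
 */
function generate_raw_aes256_key(): string
{
    return base64_encode(random_bytes(32));
}

/**
 * Build a Cloud KMS key resource name in the format used by DLP_DEID_KEY_NAME.
 */
function kms_key_name(string $project, string $location, string $keyRing, string $key): string
{
    return sprintf(
        'projects/%s/locations/%s/keyRings/%s/cryptoKeys/%s',
        $project,
        $location,
        $keyRing,
        $key
    );
}
```

Keeping the key name in a helper like this avoids typos in the long resource path; the raw key should never be stored unwrapped alongside the samples.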
- [Create de-identify templates](https://console.cloud.google.com/security/dlp/create/template;template=deidentifyTemplate):
- Create a default de-identify template for unstructured files.
- Create a de-identify template for structured files.
- Create an image redaction template for images.
```
export DLP_DEIDENTIFY_TEMPLATE=YOUR_DEFAULT_DEIDENTIFY_TEMPLATE
export DLP_STRUCTURED_DEIDENTIFY_TEMPLATE=YOUR_STRUCTURED_DEIDENTIFY_TEMPLATE
export DLP_IMAGE_REDACT_DEIDENTIFY_TEMPLATE=YOUR_IMAGE_REDACT_TEMPLATE
```
- Copy and paste the data below into a CSV file and [create a BigQuery table](https://cloud.google.com/bigquery/docs/loading-data-local) from the file:
```
Name,TelephoneNumber,Mystery,Age,Gender
James,(567) 890-1234,8291 3627 8250 1234,19,Male
Gandalf,(223) 456-7890,4231 5555 6781 9876,27,Male
Dumbledore,(313) 337-1337,6291 8765 1095 7629,27,Male
Joe,(452) 223-1234,3782 2288 1166 3030,35,Male
Marie,(452) 223-1234,8291 3627 8250 1234,35,Female
Carrie,(567) 890-1234,2253 5218 4251 4526,35,Female
```
Set the `DLP_DATASET_ID` and `DLP_TABLE_ID` environment variables.
```
export DLP_DATASET_ID=YOUR_BIGQUERY_DATASET_ID
export DLP_TABLE_ID=YOUR_TABLE_ID
```
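One way to produce the CSV file is to write the rows out from a short script rather than pasting them by hand. The sketch below does that; the function name `write_sample_csv` and the output path are hypothetical, and loading the file into BigQuery remains a separate step.

```php
<?php
// Hypothetical helper: writes the sample rows to a CSV file for upload.

/**
 * Write the sample table to $path and return the number of data rows written.
 */
function write_sample_csv(string $path): int
{
    $rows = [
        ['Name', 'TelephoneNumber', 'Mystery', 'Age', 'Gender'],
        ['James', '(567) 890-1234', '8291 3627 8250 1234', '19', 'Male'],
        ['Gandalf', '(223) 456-7890', '4231 5555 6781 9876', '27', 'Male'],
        ['Dumbledore', '(313) 337-1337', '6291 8765 1095 7629', '27', 'Male'],
        ['Joe', '(452) 223-1234', '3782 2288 1166 3030', '35', 'Male'],
        ['Marie', '(452) 223-1234', '8291 3627 8250 1234', '35', 'Female'],
        ['Carrie', '(567) 890-1234', '2253 5218 4251 4526', '35', 'Female'],
    ];

    $handle = fopen($path, 'w');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path for writing");
    }
    foreach ($rows as $row) {
        // Delimiter, enclosure, and escape are passed explicitly so the
        // output does not depend on version-specific defaults.
        fputcsv($handle, $row, ',', '"', '\\');
    }
    fclose($handle);

    // The header row is not counted as data.
    return count($rows) - 1;
}
```

`fputcsv` encloses fields that contain spaces in double quotes, which is still valid CSV.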
- [Create a Google Cloud Datastore](https://console.cloud.google.com/datastore) kind and add entities with the following properties:
```
Person Name : John
Phone Number : 343-343-3435

Person Name : Gary
Phone Number : 343-443-3136
```
Provide namespace and kind values.
- Set the `DLP_NAMESPACE_ID` and `DLP_DATASTORE_KIND` environment variables to the values provided in the step above.
```
export DLP_NAMESPACE_ID=YOUR_NAMESPACE_ID
export DLP_DATASTORE_KIND=YOUR_DATASTORE_KIND
```
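Because the setup involves many variables, a pre-flight check can catch a missing one before any test runs. The following is a minimal sketch: the variable names come from the setup steps, while the constant `REQUIRED_ENV_VARS` and the function `missing_env_vars` are hypothetical and not part of the repository's test helpers.

```php
<?php
// Hypothetical pre-flight check; variable names are taken from the setup steps.

const REQUIRED_ENV_VARS = [
    'GOOGLE_APPLICATION_CREDENTIALS',
    'GOOGLE_PROJECT_ID',
    'GOOGLE_STORAGE_BUCKET',
    'GCS_PATH',
    'DLP_DEID_WRAPPED_KEY',
    'DLP_DEID_KEY_NAME',
    'DLP_DEIDENTIFY_TEMPLATE',
    'DLP_STRUCTURED_DEIDENTIFY_TEMPLATE',
    'DLP_IMAGE_REDACT_DEIDENTIFY_TEMPLATE',
    'DLP_DATASET_ID',
    'DLP_TABLE_ID',
    'DLP_NAMESPACE_ID',
    'DLP_DATASTORE_KIND',
];

/**
 * Return the names of the given variables that are unset or empty.
 *
 * @param string[] $names Variable names to check.
 * @return string[] Names that are missing, in the order given.
 */
function missing_env_vars(array $names = REQUIRED_ENV_VARS): array
{
    return array_values(array_filter($names, function (string $name): bool {
        $value = getenv($name);
        return $value === false || $value === '';
    }));
}
```

Calling `missing_env_vars()` with no arguments checks the full list; an empty array means every variable is set.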

## Troubleshooting

### bcmath extension missing
Lines changed: 147 additions & 0 deletions
@@ -0,0 +1,147 @@
<?php
/**
 * Copyright 2023 Google Inc.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

/**
 * For instructions on how to run the samples:
 *
 * @see https://github.com/GoogleCloudPlatform/php-docs-samples/tree/main/dlp/README.md
 */

namespace Google\Cloud\Samples\Dlp;

# [START dlp_inspect_bigquery_send_to_scc]
use Google\Cloud\Dlp\V2\DlpServiceClient;
use Google\Cloud\Dlp\V2\InfoType;
use Google\Cloud\Dlp\V2\InspectConfig;
use Google\Cloud\Dlp\V2\InspectConfig\FindingLimits;
use Google\Cloud\Dlp\V2\StorageConfig;
use Google\Cloud\Dlp\V2\Likelihood;
use Google\Cloud\Dlp\V2\Action;
use Google\Cloud\Dlp\V2\Action\PublishSummaryToCscc;
use Google\Cloud\Dlp\V2\BigQueryOptions;
use Google\Cloud\Dlp\V2\BigQueryTable;
use Google\Cloud\Dlp\V2\InspectJobConfig;
use Google\Cloud\Dlp\V2\DlpJob\JobState;

/**
 * (BIGQUERY) Send Cloud DLP scan results to Security Command Center.
 * Uses Cloud Data Loss Prevention to scan specific Google Cloud resources and sends the results to Security Command Center.
 *
 * @param string $callingProjectId  The project ID to run the API call under.
 * @param string $projectId         The ID of the project that contains the BigQuery table.
 * @param string $datasetId         The ID of the BigQuery Dataset.
 * @param string $tableId           The ID of the BigQuery Table to be inspected.
 */
function inspect_bigquery_send_to_scc(
    // TODO(developer): Replace sample parameters before running the code.
    string $callingProjectId,
    string $projectId,
    string $datasetId,
    string $tableId
): void {
    // Instantiate a client.
    $dlp = new DlpServiceClient();

    // Construct the items to be inspected.
    $bigqueryTable = (new BigQueryTable())
        ->setProjectId($projectId)
        ->setDatasetId($datasetId)
        ->setTableId($tableId);
    $bigQueryOptions = (new BigQueryOptions())
        ->setTableReference($bigqueryTable);

    $storageConfig = (new StorageConfig())
        ->setBigQueryOptions($bigQueryOptions);

    // Specify the type of info the inspection will look for.
    $infoTypes = [
        (new InfoType())->setName('EMAIL_ADDRESS'),
        (new InfoType())->setName('PERSON_NAME'),
        (new InfoType())->setName('LOCATION'),
        (new InfoType())->setName('PHONE_NUMBER')
    ];

    // Specify how the content should be inspected.
    $inspectConfig = (new InspectConfig())
        ->setMinLikelihood(Likelihood::UNLIKELY)
        ->setLimits((new FindingLimits())
            ->setMaxFindingsPerRequest(100))
        ->setInfoTypes($infoTypes)
        ->setIncludeQuote(true);

    // Specify the action that is triggered when the job completes.
    $action = (new Action())
        ->setPublishSummaryToCscc(new PublishSummaryToCscc());

    // Configure the inspection job we want the service to perform.
    $inspectJobConfig = (new InspectJobConfig())
        ->setInspectConfig($inspectConfig)
        ->setStorageConfig($storageConfig)
        ->setActions([$action]);

    // Send the job creation request and process the response.
    $parent = "projects/$callingProjectId/locations/global";
    $job = $dlp->createDlpJob($parent, [
        'inspectJob' => $inspectJobConfig
    ]);

    // Poll the job up to 10 times, 10 seconds apart, until it reports DONE.
    $numOfAttempts = 10;
    do {
        printf('Waiting for job to complete' . PHP_EOL);
        sleep(10);
        $job = $dlp->getDlpJob($job->getName());
        if ($job->getState() == JobState::DONE) {
            break;
        }
        $numOfAttempts--;
    } while ($numOfAttempts > 0);

    // Print finding counts.
    printf('Job %s status: %s' . PHP_EOL, $job->getName(), JobState::name($job->getState()));
    switch ($job->getState()) {
        case JobState::DONE:
            $infoTypeStats = $job->getInspectDetails()->getResult()->getInfoTypeStats();
            if (count($infoTypeStats) === 0) {
                printf('No findings.' . PHP_EOL);
            } else {
                foreach ($infoTypeStats as $infoTypeStat) {
                    printf(
                        '  Found %s instance(s) of infoType %s' . PHP_EOL,
                        $infoTypeStat->getCount(),
                        $infoTypeStat->getInfoType()->getName()
                    );
                }
            }
            break;
        case JobState::FAILED:
            printf('Job %s had errors:' . PHP_EOL, $job->getName());
            $errors = $job->getErrors();
            foreach ($errors as $error) {
                var_dump($error->getDetails());
            }
            break;
        case JobState::PENDING:
            printf('Job has not completed. Consider a longer timeout or an asynchronous execution model' . PHP_EOL);
            break;
        default:
            printf('Unexpected job state. Most likely, the job is either running or has not yet started.' . PHP_EOL);
    }
}
# [END dlp_inspect_bigquery_send_to_scc]
// The following 2 lines are only needed to run the samples
require_once __DIR__ . '/../../testing/sample_helpers.php';
\Google\Cloud\Samples\execute_sample(__FILE__, __NAMESPACE__, $argv);
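The polling loop in this sample exits early only when the job reports `DONE`, so a job that fails quickly is still polled until the attempts run out. A minimal sketch of a bounded poll that stops on any terminal state follows. The helper `poll_until_terminal`, its callable parameter, and the use of plain strings for states are assumptions made to keep the example self-contained; they are not part of the DLP client library.

```php
<?php
// Hypothetical helper, independent of the DLP client library.

/**
 * Call $fetchState until it returns a terminal state or attempts run out.
 *
 * @param callable $fetchState   Returns the current state as a string.
 * @param string[] $terminal     States that end the polling.
 * @param int      $maxAttempts  Upper bound on the number of calls.
 * @param int      $delaySeconds Pause between calls.
 * @return string The last state observed.
 */
function poll_until_terminal(
    callable $fetchState,
    array $terminal = ['DONE', 'FAILED', 'CANCELED'],
    int $maxAttempts = 10,
    int $delaySeconds = 10
): string {
    // Placeholder that is returned only if $maxAttempts is less than 1.
    $state = 'PENDING';
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $state = $fetchState();
        if (in_array($state, $terminal, true)) {
            break;
        }
        // Skip the pause after the final attempt.
        if ($attempt < $maxAttempts) {
            sleep($delaySeconds);
        }
    }
    return $state;
}
```

With the client from the sample, the callable could wrap `$dlp->getDlpJob($job->getName())` and return `JobState::name($job->getState())`.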
Lines changed: 145 additions & 0 deletions
@@ -0,0 +1,145 @@
<?php
/**
 * Copyright 2023 Google Inc.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

/**
 * For instructions on how to run the samples:
 *
 * @see https://github.com/GoogleCloudPlatform/php-docs-samples/tree/main/dlp/README.md
 */

namespace Google\Cloud\Samples\Dlp;

# [START dlp_inspect_datastore_send_to_scc]
use Google\Cloud\Dlp\V2\DlpServiceClient;
use Google\Cloud\Dlp\V2\InfoType;
use Google\Cloud\Dlp\V2\InspectConfig;
use Google\Cloud\Dlp\V2\InspectConfig\FindingLimits;
use Google\Cloud\Dlp\V2\StorageConfig;
use Google\Cloud\Dlp\V2\Likelihood;
use Google\Cloud\Dlp\V2\Action;
use Google\Cloud\Dlp\V2\Action\PublishSummaryToCscc;
use Google\Cloud\Dlp\V2\DatastoreOptions;
use Google\Cloud\Dlp\V2\InspectJobConfig;
use Google\Cloud\Dlp\V2\KindExpression;
use Google\Cloud\Dlp\V2\PartitionId;
use Google\Cloud\Dlp\V2\DlpJob\JobState;

/**
 * (DATASTORE) Send Cloud DLP scan results to Security Command Center.
 * Uses Cloud Data Loss Prevention to scan specific Google Cloud resources and sends the results to Security Command Center.
 *
 * @param string $callingProjectId  The project ID to run the API call under.
 * @param string $kindName          Datastore kind name to be inspected.
 * @param string $namespaceId       Namespace ID to be inspected.
 */
function inspect_datastore_send_to_scc(
    string $callingProjectId,
    string $kindName,
    string $namespaceId
): void {
    // Instantiate a client.
    $dlp = new DlpServiceClient();

    // Construct the items to be inspected.
    $datastoreOptions = (new DatastoreOptions())
        ->setKind((new KindExpression())
            ->setName($kindName))
        ->setPartitionId((new PartitionId())
            ->setNamespaceId($namespaceId)
            ->setProjectId($callingProjectId));

    $storageConfig = (new StorageConfig())
        ->setDatastoreOptions($datastoreOptions);

    // Specify the type of info the inspection will look for.
    $infoTypes = [
        (new InfoType())->setName('EMAIL_ADDRESS'),
        (new InfoType())->setName('PERSON_NAME'),
        (new InfoType())->setName('LOCATION'),
        (new InfoType())->setName('PHONE_NUMBER')
    ];

    // Specify how the content should be inspected.
    $inspectConfig = (new InspectConfig())
        ->setMinLikelihood(Likelihood::UNLIKELY)
        ->setLimits((new FindingLimits())
            ->setMaxFindingsPerRequest(100))
        ->setInfoTypes($infoTypes)
        ->setIncludeQuote(true);

    // Specify the action that is triggered when the job completes.
    $action = (new Action())
        ->setPublishSummaryToCscc(new PublishSummaryToCscc());

    // Construct inspect job config to run.
    $inspectJobConfig = (new InspectJobConfig())
        ->setInspectConfig($inspectConfig)
        ->setStorageConfig($storageConfig)
        ->setActions([$action]);

    // Send the job creation request and process the response.
    $parent = "projects/$callingProjectId/locations/global";
    $job = $dlp->createDlpJob($parent, [
        'inspectJob' => $inspectJobConfig
    ]);

    // Poll the job up to 10 times, 10 seconds apart, until it reports DONE.
    $numOfAttempts = 10;
    do {
        printf('Waiting for job to complete' . PHP_EOL);
        sleep(10);
        $job = $dlp->getDlpJob($job->getName());
        if ($job->getState() == JobState::DONE) {
            break;
        }
        $numOfAttempts--;
    } while ($numOfAttempts > 0);

    // Print finding counts.
    printf('Job %s status: %s' . PHP_EOL, $job->getName(), JobState::name($job->getState()));
    switch ($job->getState()) {
        case JobState::DONE:
            $infoTypeStats = $job->getInspectDetails()->getResult()->getInfoTypeStats();
            if (count($infoTypeStats) === 0) {
                printf('No findings.' . PHP_EOL);
            } else {
                foreach ($infoTypeStats as $infoTypeStat) {
                    printf(
                        '  Found %s instance(s) of infoType %s' . PHP_EOL,
                        $infoTypeStat->getCount(),
                        $infoTypeStat->getInfoType()->getName()
                    );
                }
            }
            break;
        case JobState::FAILED:
            printf('Job %s had errors:' . PHP_EOL, $job->getName());
            $errors = $job->getErrors();
            foreach ($errors as $error) {
                var_dump($error->getDetails());
            }
            break;
        case JobState::PENDING:
            printf('Job has not completed. Consider a longer timeout or an asynchronous execution model' . PHP_EOL);
            break;
        default:
            printf('Unexpected job state. Most likely, the job is either running or has not yet started.' . PHP_EOL);
    }
}
# [END dlp_inspect_datastore_send_to_scc]
// The following 2 lines are only needed to run the samples
require_once __DIR__ . '/../../testing/sample_helpers.php';
\Google\Cloud\Samples\execute_sample(__FILE__, __NAMESPACE__, $argv);
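Both sample functions take positional arguments, and the README's setup steps define environment variables that plausibly supply them. The sketch below collects the arguments for each function from an environment array. The pairing of variable to argument is an assumption rather than something the commit states, and `SAMPLE_ARG_ENV` and `sample_args` are hypothetical names.

```php
<?php
// Hypothetical mapping from the README's environment variables to the
// positional arguments of the two sample functions. The pairing is an
// assumption, not taken from the commit.

const SAMPLE_ARG_ENV = [
    'inspect_bigquery_send_to_scc' => [
        'GOOGLE_PROJECT_ID', // $callingProjectId
        'GOOGLE_PROJECT_ID', // $projectId
        'DLP_DATASET_ID',    // $datasetId
        'DLP_TABLE_ID',      // $tableId
    ],
    'inspect_datastore_send_to_scc' => [
        'GOOGLE_PROJECT_ID',  // $callingProjectId
        'DLP_DATASTORE_KIND', // $kindName
        'DLP_NAMESPACE_ID',   // $namespaceId
    ],
];

/**
 * Return the positional arguments for a sample, read from $env.
 *
 * @param string                $sample Sample function name.
 * @param array<string, string> $env    Environment as name => value.
 * @return string[] Arguments in the order the function declares them.
 */
function sample_args(string $sample, array $env): array
{
    if (!array_key_exists($sample, SAMPLE_ARG_ENV)) {
        throw new InvalidArgumentException("Unknown sample: $sample");
    }
    $args = [];
    foreach (SAMPLE_ARG_ENV[$sample] as $name) {
        if (($env[$name] ?? '') === '') {
            throw new RuntimeException("Environment variable $name is not set");
        }
        $args[] = $env[$name];
    }
    return $args;
}
```

Passing `getenv()` as `$env` reads the real environment; passing a plain array keeps the helper easy to test.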
