Skip to content

Latest commit

 

History

History
46 lines (34 loc) · 2.92 KB

export_to_pubsub.md

File metadata and controls

46 lines (34 loc) · 2.92 KB

Let's imagine you have a BigQuery table that tracks real-time sensor data from a manufacturing plant. You want to trigger a downstream process whenever a sensor reading exceeds a certain threshold. This downstream process might be an alert system, an automated adjustment to the machinery, or a data pipeline for further analysis. Instead of constantly polling the BigQuery table, you can use the export_to_pubsub function to push relevant data to a Pub/Sub topic in real-time.

Here's how it would work:

  1. BigQuery Streaming Inserts: Sensor data is streamed into a BigQuery table.

  2. BigQuery Scheduled Query: A scheduled query runs periodically (e.g., every minute) to check for sensor readings exceeding the threshold. The query might look something like this:

    SELECT sensor_id, timestamp, reading
    FROM `your-project.your_dataset.sensor_data`
    WHERE reading > 1000 AND _PARTITIONTIME >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 MINUTE)
  3. Call export_to_pubsub within the query: If the query returns any results, you integrate the export_to_pubsub function within the query itself:

    SELECT bigfunctions.your_region.export_to_pubsub(
        'your-project',
        'sensor_alerts_topic',
        TO_JSON_STRING(t),  -- Send the entire row as the message data
        '{"sensor_type": "temperature"}' -- Add attributes for context
    )
    FROM (
        SELECT sensor_id, timestamp, reading
        FROM `your-project.your_dataset.sensor_data`
        WHERE reading > 1000 AND _PARTITIONTIME >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 MINUTE)
    ) AS t;
  4. Pub/Sub triggers downstream process: The sensor_alerts_topic is subscribed to by a service that handles the alerts or automated actions. When a message arrives on the topic, the subscriber is triggered, and the downstream process is initiated.

Benefits:

  • Real-time responsiveness: Alerts and actions are triggered as soon as the threshold is crossed, without needing to constantly poll BigQuery.
  • Scalability: Pub/Sub handles the message distribution, ensuring that the downstream systems can scale to handle a large volume of alerts.
  • Decoupling: BigQuery is decoupled from the downstream processes, making the system more flexible and maintainable.
  • Reduced cost: Avoids the cost and latency of repeatedly querying BigQuery for new data.

This is just one example. The export_to_pubsub function can be used in various scenarios where you need to push data from BigQuery to other systems in a real-time or near real-time fashion. Other examples include:

  • New user registration: Push new user data to a Pub/Sub topic for welcome email processing.
  • Order fulfillment: Notify downstream systems about new orders placed.
  • Fraud detection: Push suspicious transactions to a Pub/Sub topic for real-time analysis.
  • Data synchronization: Keep other systems in sync with changes happening in BigQuery.