mongodb
diff --git a/content/manual/upcoming/source/data-modeling/design-patterns/group-data.txt b/content/manual/upcoming/source/data-modeling/design-patterns/group-data.txt
Lines changed: 1 addition & 0 deletions
diff --git a/content/manual/upcoming/source/data-modeling/design-patterns/group-data/subset-pattern.txt b/content/manual/upcoming/source/data-modeling/design-patterns/group-data/subset-pattern.txt
Lines changed: 170 additions & 0 deletions
@@ -71,3 +71,4 @@ Learn More
    Bucket Pattern </data-modeling/design-patterns/group-data/bucket-pattern>
    Outlier Pattern </data-modeling/design-patterns/group-data/outlier-pattern>
    Attribute Pattern </data-modeling/design-patterns/group-data/attribute-pattern>
+   Subset Pattern </data-modeling/design-patterns/group-data/subset-pattern>
@@ -0,0 +1,170 @@
+.. _group-data-subset-pattern:
+
+================================== 
+Group Data with the Subset Pattern
+==================================
+
+.. meta::
+   :description: Improve query performance and minimize the working set by using the subset pattern to group frequently accessed data into one collection and less frequently accessed data into another collection.
+
+.. contents:: On this page
+   :local:
+   :backlinks: none
+   :depth: 1
+   :class: singlecol
+
+MongoDB keeps frequently accessed data, referred to as the 
+:term:`working set`, in RAM. When the working set of data 
+and indexes grows beyond the physical RAM allotted, performance 
+degrades because MongoDB must read data from disk instead of from RAM.
+
+To solve this problem, you can shard your collection. However, 
+sharding can create additional costs and complexities that your 
+application may not be ready for. Rather than sharding your collection, 
+you can reduce the size of your working set by using the subset pattern. 
+
+The subset pattern is a data modeling technique for documents that 
+contain a large array of items when you frequently need to access 
+only a small subset of those items. In this case, the size of the 
+documents can cause the working set to exceed the available RAM. The 
+subset pattern helps optimize performance by reducing the amount of 
+data that needs to be read from the database for common queries.
+
+About this Task 
+---------------
+
+Consider an e-commerce site that has a list of reviews for a product, stored in a 
+collection called ``products``. The e-commerce site inserts
+documents with the following schema into the ``products`` collection: 
+
+.. code-block:: javascript
+
+   db.collection('products').insertOne(
+      {
+         _id: ObjectId("507f1f77bcf86cd99338452"),
+         name: "Super Widget",
+         description: "This is the most useful item in your toolbox.",
+         price: { value: NumberDecimal("119.99"), currency: "USD" },
+         reviews: [
+            {
+              review_id: 786,
+              review_author: "Kristina",
+              review_text: "This is indeed an amazing widget.",
+              published_date: ISODate("2019-02-18")
+            },
+            {
+              review_id: 785,
+              review_author: "Trina",
+              review_text: "Very nice product, slow shipping.",
+              published_date: ISODate("2019-02-17")
+            },
+            [...],
+            {
+              review_id: 1,
+              review_author: "Hans",
+              review_text: "Meh, it's ok.",
+              published_date: ISODate("2017-12-06")
+            }
+         ]
+      }
+   )
+
+When accessing a product’s data, you likely only need the most recent reviews. 
+The following procedure demonstrates how to apply the subset pattern to the above schema. 
+
+Steps
+-----
+
+.. procedure::
+   :style: normal
+
+   .. step:: Identify the subset of frequently accessed data.
+
+      In an array field containing information about a document, determine the subset of 
+      information you access most often. For example, in the ``products`` 
+      collection, you might only need to access the ten most recent reviews. 
+
+   .. step:: Split the data into separate collections.
+
+      Instead of storing all the reviews with the product, split your collection 
+      into two collections: one for your most accessed data, and one for your least 
+      accessed data. This allows for quick access to the most relevant data without 
+      having to load the entire array. 
+      
+      The first collection, the ``products`` collection, contains the 
+      most frequently used data, such as current reviews:
+
+      .. code-block:: javascript
+
+         db.collection('products').insertOne(
+           {
+             _id: ObjectId("507f1f77bcf86cd99338452"),
+             name: "Super Widget",
+             description: "This is the most useful item in your toolbox.",
+             price: { value: NumberDecimal("119.99"), currency: "USD" },
+             reviews: [
+               {
+                 review_id: 786,
+                 review_author: "Kristina",
+                 review_text: "This is indeed an amazing widget.",
+                 published_date: ISODate("2019-02-18")
+               },
+               [...],
+               {
+                 review_id: 776,
+                 review_author: "Pablo",
+                 review_text: "Amazing!",
+                 published_date: ISODate("2019-02-15")
+               }
+             ]
+           }
+         )
+
+      The ``products`` collection contains only the ten most recent reviews. 
+      Because each product document embeds only a portion, or subset, of the overall 
+      data, less data is loaded into the working set. 
+
+      The second collection, the ``reviews`` collection, contains every review for the product, including the older, less frequently accessed reviews:
+
+      .. code-block:: javascript
+
+         db.collection('reviews').insertMany( [
+            {
+              review_id: 786,
+              review_author: "Kristina",
+              review_text: "This is indeed an amazing widget.",
+              product_id: ObjectId("507f1f77bcf86cd99338452"),
+              published_date: ISODate("2019-02-18")
+            },
+            {
+              review_id: 785,
+              review_author: "Trina",
+              review_text: "Very nice product, slow shipping.",
+              product_id: ObjectId("507f1f77bcf86cd99338452"),
+              published_date: ISODate("2019-02-17")
+            },
+            [...],
+            {
+              review_id: 1,
+              review_author: "Hans",
+              review_text: "Meh, it's ok.",
+              product_id: ObjectId("507f1f77bcf86cd99338452"),
+              published_date: ISODate("2017-12-06")
+            }
+         ] )
+      
+      You can access the ``reviews`` collection whenever you need to see additional 
+      reviews. When considering where to split your data, store the most used fields 
+      of your documents in your main collection and the less frequently used data in a new collection. 
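The split described in the steps above can be sketched in plain JavaScript. This is a minimal illustration rather than part of the documented procedure: the helper name `splitProduct`, the `subsetSize` parameter, and the placeholder `_id` value are assumptions made for the example, and JavaScript `Date` objects stand in for BSON dates.

```javascript
// Hypothetical helper: split one full product document into the two shapes
// used by the subset pattern. `subsetSize` defaults to 10 to match the text.
function splitProduct(product, subsetSize = 10) {
  // Sort a copy (newest first) so the caller's array is left untouched.
  const newestFirst = [...product.reviews].sort(
    (a, b) => b.published_date - a.published_date
  );

  // Main document: every product field, but only the newest reviews embedded.
  const productDoc = { ...product, reviews: newestFirst.slice(0, subsetSize) };

  // One document per review, each pointing back at its product.
  const reviewDocs = newestFirst.map((review) => ({
    ...review,
    product_id: product._id,
  }));

  return { productDoc, reviewDocs };
}

// Usage with plain values standing in for BSON types.
const fullProduct = {
  _id: "product-1", // placeholder for an ObjectId (assumption)
  name: "Super Widget",
  reviews: [
    { review_id: 1, review_author: "Hans", published_date: new Date("2017-12-06") },
    { review_id: 786, review_author: "Kristina", published_date: new Date("2019-02-18") },
    { review_id: 785, review_author: "Trina", published_date: new Date("2019-02-17") },
  ],
};

const { productDoc, reviewDocs } = splitProduct(fullProduct, 2);
console.log(productDoc.reviews.map((r) => r.review_id)); // ids of the embedded subset
console.log(reviewDocs.length); // number of documents for the reviews collection
```

In this sketch `productDoc` would go to the ``products`` collection and `reviewDocs` to the ``reviews`` collection; the subset size is a tuning choice that depends on how many reviews the application shows by default.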
+
+Results 
+-------
+
+By using smaller documents that contain only the most frequently accessed data, you reduce 
+the overall size of the working set. More of the working set fits in RAM, so MongoDB 
+reads from disk less often when serving the information your application uses most.
+
+.. note:: 
+
+   The subset pattern requires you to manage two collections rather than one, and to query
+   both collections when you need comprehensive information about a document rather than 
+   only the subset.
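One way to handle the two-collection bookkeeping mentioned in the note is to pair every review insert with an update that trims the embedded array. The sketch below only builds the arguments for those two writes and does not connect to a database. The helper name `buildAddReviewOps` and its return shape are hypothetical, and it assumes the embedded array is kept newest-first by using the `$push` modifiers `$each`, `$sort`, and `$slice`.

```javascript
// Hypothetical helper: describe the two writes needed when a new review arrives.
function buildAddReviewOps(productId, review, subsetSize = 10) {
  return {
    // 1. Document to store in the reviews collection.
    insertIntoReviews: { ...review, product_id: productId },

    // 2. Filter and update for the products collection: append the review,
    //    re-sort the embedded array newest-first, and keep only the first
    //    `subsetSize` elements.
    updateProduct: {
      filter: { _id: productId },
      update: {
        $push: {
          reviews: {
            $each: [review],
            $sort: { published_date: -1 },
            $slice: subsetSize,
          },
        },
      },
    },
  };
}

// Usage with placeholder values (assumptions for illustration only).
const ops = buildAddReviewOps("product-1", {
  review_id: 787,
  review_author: "Example Author",
  review_text: "Example text.",
  published_date: new Date("2019-02-19"),
});
console.log(JSON.stringify(ops.updateProduct.update, null, 2));
```

With the Node.js driver, `ops.insertIntoReviews` could be passed to `insertOne()` on the ``reviews`` collection, and the filter and update to `updateOne()` on the ``products`` collection. The two writes are separate operations, so an application that needs them to succeed or fail together would have to wrap them in a transaction.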